We present a detailed examination of the heavy flavor content of the
W + jet data sample collected with the CDF detector during the 1992-1995 
collider run at the Fermilab Tevatron. Jets containing heavy flavor quarks 
are selected via the identification of secondary vertices or semileptonic decays 
of b and c quarks. There is generally good agreement between the rates of 
secondary vertices and soft leptons in the data and in the standard model 
simulation including single and pair production of top quarks. An exception 
is the number of events in which a single jet has both a soft lepton and a 
secondary vertex tag. In W+ 2,3 jet data, we find 13 such events where we 
expected 4.4 ± 0.6 events. The kinematic properties of this small sample of 
events are statistically difficult to reconcile with the simulation of standard 
model processes. 

PACS number(s): 13.85.Qk, 13.38.Be, 13.20.He 

I. INTRODUCTION 

The production of W bosons in association with jets in $p\bar{p}$ collisions at the Fermilab
Tevatron Collider provides the opportunity to test many standard model (SM) [1] predictions.
Previous CDF measurements of the inclusive W cross section and of the yield of
W + jet events as a function of the jet multiplicity and transverse momentum show agree- 
ment between data and the electroweak and QCD predictions of the standard model. In this 
study we extend the analysis of the jets associated with W boson production to include the 
properties of heavy flavor jets identified by the displaced vertex or the semileptonic decay 
of charmed and beauty quarks. 

The present data set consists of 11,076 $W \to \ell\nu$ ($\ell = e$ or $\mu$) candidates produced in
association with one or more jets selected from 105 ± 4.0 pb$^{-1}$ of data collected by the CDF
experiment at the Fermilab Tevatron ||. The b and c-quark content of this data set has 
been evaluated several times as we improved our understanding of systematic effects [4-7].
We use two different methods for identifying (tagging) jets produced by these heavy quarks.
The first method uses the CDF silicon microvertex detector (SVX) to locate secondary 
vertices produced by the decay of b and c-hadrons in a jet. These vertices (SECVTX tags) 
are separated from the primary event vertex as a result of the long b and c-hadron lifetimes. 
The second technique is to search a jet for leptons ($e$ or $\mu$) produced by the semileptonic
decay of b and c-hadrons. We refer to these as "soft lepton tags" (SLT's) because these 
leptons typically have low momentum compared to leptons from W decays. Heavy flavors 
in W+ jet events are mainly contributed by the production and decay of top quarks, by 
direct Wc production, and by the production of Wg states in which the gluon branches into 
a heavy-quark pair (gluon splitting). 

A recent comparison between measured and predicted rates of W + jet events with
heavy flavor as a function of the jet multiplicity is presented in Ref. [7]. The focus of that
paper, as well as previous CDF publications [4-6], is the measurement of the $t\bar{t}$ production
cross section. By attributing all the excess of W + ≥ 3 jet events with a SECVTX tag over
the SM background to $t\bar{t}$ production, we find $\sigma_{t\bar{t}}$ = 5.08 ± 1.54 pb, in good agreement with
the average theoretical prediction of 5.1 pb with a 15% uncertainty [8]. We derive a
numerically larger but not inconsistent value of the cross section, $\sigma_{t\bar{t}}$ = 9.18 ± 4.26 pb, when
using events with one or more SLT tags. The D0 collaboration has also measured the $t\bar{t}$
production cross section using various techniques [9]. D0 has no measurement based upon
displaced secondary vertices, but using W + ≥ 3 jet events with a muon tag finds $\sigma_{t\bar{t}}$ = 8.2
± 3.5 pb. In the present study, we adopt a different approach to the study of the W + jet
sample and use the theoretical estimate of $\sigma_{t\bar{t}}$ to test if the SM prediction is compatible with
the observed yield of different tags as a function of the jet multiplicity. This is of interest for
top quark studies and searches for new physics, since some mechanisms proposed to explain
electroweak symmetry breaking, such as the Higgs mechanism [10] or the dynamics of a new
interaction [11], predict the existence of new particles which can be produced in association
with a W boson and decay into $b\bar{b}$.
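The statement that the SECVTX-based and SLT-based cross sections are "not inconsistent" can be made concrete with a simple pull. The sketch below is illustrative only: it uses the values quoted above and assumes, for simplicity, that the uncertainties of the two measurements are uncorrelated, which is not claimed in the text. The function name `pull` is hypothetical.

```python
import math


def pull(value_a, error_a, value_b, error_b):
    """Difference (b - a) in units of its uncertainty.

    Assumption: the two uncertainties are uncorrelated and Gaussian.
    """
    return (value_b - value_a) / math.hypot(error_a, error_b)


# Cross sections quoted in the text, in pb: (value, uncertainty).
SECVTX = (5.08, 1.54)
SLT = (9.18, 4.26)
THEORY = (5.1, 0.15 * 5.1)  # 5.1 pb with a 15% uncertainty

print(f"SLT vs SECVTX:    {pull(*SECVTX, *SLT):+.2f} sigma")
print(f"SECVTX vs theory: {pull(*THEORY, *SECVTX):+.2f} sigma")
```

Under these assumptions the two measurements differ by less than one standard deviation.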

Following a description of the CDF detector in Section II, Section III describes the
triggers and the reconstruction of leptons, jets and the missing transverse energy. The
selection of the W + jet sample is described in Section IV, which also contains a discussion
of the algorithms used for the heavy flavor identification followed by a description of the
Monte Carlo generators and the detector simulation used to model these events. In Section V
we summarize the method used in Ref. [7] to predict the number of W + jet events with
heavy flavor and then compare the observed yield of different tags as a function of the
jet multiplicity to the SM prediction including single and pair production of top quarks.
Following this comparison, in Section VI we study the yield of W + jet events with a
SECVTX and a SLT tag in the same jet (supertag); jets with a supertag will be referred
to as superjets in the following. Since the semileptonic branching ratios of b and c-hadrons
are very well measured [12], the measurement of the fraction of jets tagged by SECVTX
which contain a soft lepton tag provides an additional test of our understanding of the
heavy flavor composition of this data sample. The number of these events in the W + 2 and
W + 3 jet topologies is larger than the SM prediction. In Section VII we compare kinematic
distributions of the events with a superjet to the simulation prediction. As a check, we also
compare the simulation to a complementary sample of data. We find that the SM simulation
models the kinematics of the complementary sample well, but does not properly describe
the characteristics of the events with a superjet. Some properties of the primary and soft
leptons are discussed in Section VIII, while Section IX contains a study of other properties
of the superjets. In Section X we investigate the dependence of this study on the criteria
used to select the data. Section XI summarizes our conclusions.



The prefix "super" is used as a generalized term of high quality for historical reasons and is not
meant as a reference to supersymmetry.

II. THE CDF DETECTOR



CDF is a general purpose detector designed to study $p\bar{p}$ interactions. A complete description
of CDF can be found in Refs. [4,13]. The detector components most relevant to this
analysis are summarized below. CDF has azimuthal and forward-backward symmetry. A
superconducting solenoid of length 4.8 m and radius 1.5 m generates a 1.4 T magnetic field.
Inside the solenoid there are three types of tracking chambers for detecting charged particles
and measuring their momenta. A four-layer silicon microstrip vertex detector surrounds the
beryllium beam pipe of radius 1.9 cm. The SVX has an active length of 51 cm; the four
layers of the SVX are at distances of 2.9, 4.2, 5.5 and 7.9 cm from the beamline. Axial
microstrips with 60 $\mu$m pitch provide accurate track reconstruction in the plane transverse
to the beam [14]. Outside the SVX there is a vertex drift chamber (VTX) which provides
track information up to a radius of 22 cm and for pseudo-rapidity $|\eta|$ < 3.5. The VTX measures
the z-position (along the beamline) of the primary vertex. Both the SVX and VTX are
mounted inside the CTC, a 3.2 m long drift chamber with an outer radius of 132 cm containing
84 concentric, cylindrical layers of sense wires, which are grouped into alternating axial
and stereo superlayers. The solenoid is surrounded by sampling calorimeters used to measure
the electromagnetic and hadronic energy of jets and electrons. The calorimeters cover
the pseudo-rapidity range $|\eta|$ < 4.2. The calorimeters are segmented into $\eta$-$\phi$ towers which
point to the nominal interaction point. There are three separate $\eta$-regions of calorimeters.
Each region has an electromagnetic calorimeter [central (CEM), plug (PEM) and forward
(FEM)] and behind it a hadron calorimeter [CHA, PHA and FHA, respectively]. Located
six radiation lengths inside the CEM calorimeter, proportional wire chambers (CES) provide
shower-position measurements in the z and $r$-$\phi$ views. Proportional chambers (CPR)
located between the solenoid and the CEM detect early development of electromagnetic
showers in the solenoid coil. These chambers provide $r$-$\phi$ information only.

The calorimeter acts as a first hadron absorber for the central muon detection system
which covers the pseudo-rapidity range $|\eta|$ < 1.0. The CMU detector consists of four layers
of drift chambers located outside the CHA calorimeter. This detector covers the pseudo-rapidity
range $|\eta|$ < 0.6 and can be reached by muons with $p_T$ > 1.4 GeV/c. The CMU
detector is followed by 0.6 m of steel and four additional layers of drift chambers (CMP).
The CMX system of drift chambers extends the muon detection to $|\eta|$ < 1.0.

III. DATA COLLECTION AND IDENTIFICATION OF JETS AND LEPTONS 

The selection of W+ jet events is based upon the identification of electrons, muons, 
missing energy, and jets. Below we discuss the criteria used to select these objects. 

A. Triggers 

The data acquisition is triggered by a three-level system designed to select events that
can contain electrons, muons, jets, and missing transverse energy ($\not\!E_T$).

Central electrons are defined as CEM clusters with $E_T$ > 18 GeV and a reconstructed
track with $p_T$ > 13 GeV/c pointing to it. The ratio of hadronic to electromagnetic energy in
the cluster ($E_{had}/E_{em}$) is required to be less than 0.125. Plug electrons, used for checks, have
a higher transverse energy threshold ($E_T$ > 20 GeV). The inclusive muon trigger requires
a match of better than 10 cm in $r\Delta\phi$ between a reconstructed track with $p_T$ > 18 GeV/c,
extrapolated to the radius of the muon detector, and a track segment in the muon chambers.
Calorimeter towers are combined into electromagnetic and jet-like clusters by the trigger
system, which also provides an estimate of $\not\!E_T$. Trigger efficiencies have been measured
using the data and are included in the detector simulation.

B. Electron selection 

We use electrons in the central pseudo-rapidity region ($|\eta|$ < 1.0). Stricter selection
cuts are applied to central electrons which passed the trigger prerequisites. The following
variables are used to discriminate against charged hadrons: (1) the ratio of hadronic to
electromagnetic energy of the cluster, $E_{had}/E_{em}$; (2) the ratio of cluster energy to track
momentum, E/P; (3) a comparison of the lateral shower profile in the calorimeter cluster
with that of test-beam electrons, $L_{shr}$; (4) the distance between the extrapolated track
position and the CES measurement in the $r$-$\phi$ and z views, $\Delta x$ and $\Delta z$, respectively;
(5) a $\chi^2$ comparison of the CES shower profile with that of test-beam electrons, $\chi^2_{strip}$; (6)
the distance between the interaction vertex and the reconstructed track in the z-direction,
z-vertex match; and (7) the isolation, I, defined as the ratio of additional transverse energy
in a cone of radius $R = \sqrt{(\Delta\phi)^2 + (\Delta\eta)^2} = 0.4$ around the electron direction to the electron
transverse energy. Fiducial cuts on the shower position measured by the CES are applied to
ensure that the electron candidate is away from calorimeter boundaries and therefore provides
a reliable energy measurement. Electrons from photon conversions are removed with high
efficiency using the tracking information in the event. A more detailed description of the
primary electron selection can be found in Refs. [4,7].
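The isolation variable defined above can be illustrated with a short sketch. The tower representation (dictionaries with `et`, `eta`, `phi` keys) and the function names are assumptions made for this example; in particular, the towers passed in are assumed to exclude the electron cluster itself, and no detector-specific details are modeled.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Cone distance R = sqrt(dphi^2 + deta^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def isolation(lepton, other_towers, cone=0.4):
    """Additional transverse energy inside the cone divided by the lepton E_T.

    Assumption: `other_towers` does not contain the lepton cluster itself.
    """
    extra_et = sum(
        tower["et"]
        for tower in other_towers
        if delta_r(tower["eta"], tower["phi"], lepton["eta"], lepton["phi"]) < cone
    )
    return extra_et / lepton["et"]


# Hypothetical example: one tower inside the cone (across the phi wrap-around),
# one tower well outside it.
electron = {"et": 40.0, "eta": 0.0, "phi": 3.1}
towers = [
    {"et": 2.0, "eta": 0.1, "phi": -3.1},
    {"et": 5.0, "eta": 1.0, "phi": 3.1},
]
print(f"I = {isolation(electron, towers):.3f}")
```

The wrap-around in `delta_phi` matters because a tower at $\phi$ = -3.1 is close in azimuth to an electron at $\phi$ = 3.1.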

The $\eta$ coverage for electron detection is extended by using the plug calorimeter. When
selecting plug electrons we replace the variables $L_{shr}$, $\chi^2_{strip}$, $\Delta x$, and $\Delta z$ used for central
electrons with the $\chi^2$ comparison of the longitudinal and transverse shower profiles, $\chi^2_{depth}$
and $\chi^2_{transv}$, respectively. We require $\chi^2_{depth}$ < 15 and $\chi^2_{transv}$ < 3. We do not use the E/P cut,
as the momentum measurement is not accurate at large rapidities. However, we require that
a track pointing to the electromagnetic cluster has hits in at least three CTC axial layers.
We also require that the ratio of the number of VTX hits found along the electron path
to the predicted number be larger than 50%. Because of the CTC geometrical acceptance
and of fiducial cuts to ensure a reliable energy measurement, the effective coverage for plug
electrons is 1.2 < $|\eta|$ < 1.5.



C. Muon selection

Muons are identified in the pseudo-rapidity region $|\eta|$ < 1.0 by requiring a match between
a CTC track and a track segment measured by the CMU, CMP or CMX muon chambers.
The following variables are used to separate muons from hadrons interacting in the
calorimeter and cosmic rays: (1) an energy deposition in the electromagnetic and hadronic
calorimeters characteristic of minimum ionizing particles, $E_{em}$ and $E_{had}$, respectively; (2)
the distance of closest approach of the reconstructed track to the beam line (impact parameter),
d; (3) the z-vertex match; (4) the distance between the extrapolated track and the
track segment in the muon chamber, $\Delta x = r\Delta\phi$; and (5) the isolation I. A more detailed
description of the primary muon selection can be found in Refs. [4,7]. Selection efficiencies
for electrons and muons in the simulation are adjusted to those of $Z \to \ell\ell$ events in the
data.



D. Loose leptons 

In order to be more efficient in rejecting events containing two leptons from Z decays,
$t\bar{t}$ decays and other sources we use looser selection criteria to search for additional isolated
leptons. These selection criteria are described in detail in Ref. [7].



E. Jet identification and corrections 

Jets are reconstructed from the energy deposited in the calorimeter using a clustering
algorithm with a fixed cone of radius R = 0.4 in the $\eta$-$\phi$ space. A detailed description
of the algorithm can be found in Ref. [15]. Jet energies can be mismeasured for a variety
of reasons (calorimeter non-linearity, loss of low momentum particles because of bending in
the magnetic field, contributions from the underlying event, out-of-cone losses, undetected
energy carried by muons and neutrinos). Corrections, which depend on the jet $E_T$ and $\eta$,
are applied to jet energies; they compensate for these mismeasurements on average but do
not improve the jet energy resolution. We estimate a 10% uncertainty on the corrected jet
energy fljl6fl. Where appropriate, we apply additional corrections to jet energies in order to 



extrapolate on average to the energy of the parton producing the jet [010,18 

F. $\not\!E_T$ measurement

The missing transverse energy ($\not\!E_T$) is defined as the negative of the vector sum of the
transverse energy in all calorimeter towers with $|\eta|$ < 3.5. For events with muon candidates
the vector sum of the calorimeter transverse energy is corrected by vectorially subtracting
the energy deposited by the muon and then adding the $p_T$ of the muon as measured by the
tracking detectors. This is done for all muon candidates with $p_T$ > 5 GeV/c and I < 0.1.
When jet energy corrections are used, the $\not\!E_T$ calculation accounts for them as detailed in
Ref. [0. 
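The definition of $\not\!E_T$ and its muon correction can be sketched as follows. This is a minimal illustration under stated assumptions: towers are given as hypothetical `(et, eta, phi)` tuples, each muon is a dictionary with hypothetical keys `pt`, `phi`, `calo_et` (the transverse energy it deposited in the calorimeter) and `iso`, and jet energy corrections are not included.

```python
import math


def missing_et(towers, muons=()):
    """Return (magnitude, phi) of the missing transverse energy.

    towers: iterable of (et, eta, phi) tuples.
    muons:  iterable of dicts with keys pt, phi, calo_et, iso (hypothetical layout).
    """
    sum_x = sum_y = 0.0
    for et, eta, phi in towers:
        if abs(eta) < 3.5:  # only towers with |eta| < 3.5 enter the sum
            sum_x += et * math.cos(phi)
            sum_y += et * math.sin(phi)
    for muon in muons:
        if muon["pt"] > 5.0 and muon["iso"] < 0.1:
            # Subtract the calorimeter deposit and add the track p_T instead.
            delta = muon["pt"] - muon["calo_et"]
            sum_x += delta * math.cos(muon["phi"])
            sum_y += delta * math.sin(muon["phi"])
    met_x, met_y = -sum_x, -sum_y  # negative of the vector sum
    return math.hypot(met_x, met_y), math.atan2(met_y, met_x)


# Hypothetical event: a 30 GeV tower at phi = 0 and a muon at phi = pi that left
# 2 GeV in the calorimeter but has a measured p_T of 25 GeV/c.
example_towers = [(30.0, 0.5, 0.0), (2.0, 0.1, math.pi)]
example_muon = {"pt": 25.0, "phi": math.pi, "calo_et": 2.0, "iso": 0.05}
print(missing_et(example_towers))
print(missing_et(example_towers, [example_muon]))
```

In this toy event the muon correction reduces the apparent imbalance, because most of the momentum recoiling against the tower is carried by the muon and is not seen by the calorimeter.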



IV. THE W + JET SAMPLE 

The W selection requires an isolated (I < 0.1) electron (muon) to pass the trigger and
offline requisites outlined in Section III, and also to have $E_T$ > 20 GeV ($p_T$ > 20 GeV/c).
We require the z-position of the event vertex ($Z_{vrtx}$) to be within 60 cm of the center of
the CDF detector. We additionally require $\not\!E_T$ > 20 GeV to reduce the background from
misidentified leptons and semileptonic b-hadron decays. Events containing additional loose
lepton candidates with isolation I < 0.15 and $p_T$ > 10 GeV/c are removed from the sample.
We bin the W candidate events according to the observed jet multiplicity (a jet is an R = 0.4
cluster with uncorrected $E_T$ > 15 GeV and $|\eta|$ < 2.0).
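The kinematic part of this selection can be summarized in a few lines. The sketch below is only an illustration: the `Event` container and its field names are hypothetical, and the trigger and lepton-quality requirements of Section III are assumed to have been applied already.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Event:
    """Hypothetical event summary; not the experiment's data format."""
    lepton_et: float            # E_T (electron) or p_T (muon), in GeV
    lepton_iso: float           # isolation I of the primary lepton
    z_vertex_cm: float          # z-position of the event vertex, in cm
    missing_et: float           # missing transverse energy, in GeV
    jets: List[Tuple[float, float]] = field(default_factory=list)           # (uncorrected E_T, eta)
    loose_leptons: List[Tuple[float, float]] = field(default_factory=list)  # (p_T, isolation)


def passes_w_selection(ev: Event) -> bool:
    """Cuts quoted in the text; trigger and quality cuts are not modeled."""
    has_extra_lepton = any(pt > 10.0 and iso < 0.15 for pt, iso in ev.loose_leptons)
    return (
        ev.lepton_iso < 0.1
        and ev.lepton_et > 20.0
        and abs(ev.z_vertex_cm) < 60.0
        and ev.missing_et > 20.0
        and not has_extra_lepton
    )


def jet_multiplicity(ev: Event) -> int:
    """Count R = 0.4 clusters with uncorrected E_T > 15 GeV and |eta| < 2.0."""
    return sum(1 for et, eta in ev.jets if et > 15.0 and abs(eta) < 2.0)


candidate = Event(
    lepton_et=32.0, lepton_iso=0.03, z_vertex_cm=-12.0, missing_et=28.0,
    jets=[(45.0, 0.4), (18.0, -1.7), (14.0, 0.2), (30.0, 2.4)],
)
print(passes_w_selection(candidate), jet_multiplicity(candidate))
```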

The heavy flavor content of the W+ jet sample is enhanced by selecting events with jets 
containing a displaced secondary vertex or a soft lepton. 



A. Description of the tagging algorithms 

The secondary vertex tagging algorithm (SECVTX) is described in detail in Refs. 
SECVTX is based on the determination of the primary event vertex and the reconstruction
of additional secondary vertices using displaced tracks contained inside jets. The search for a
secondary vertex in a jet is a two-stage process. In both stages, tracks in the jet are selected
for reconstruction of a secondary vertex based on the significance of their impact parameter
d with respect to the primary vertex, $d/\sigma_d$, where $\sigma_d$ is the estimated uncertainty on d. The
first stage requires at least three candidate tracks for the reconstruction of the secondary
vertex. Tracks consistent with coming from the decay $K_S \to \pi^+\pi^-$ or $\Lambda \to \pi^- p$ are not used
as candidate tracks. Two candidate tracks are constrained to pass through the same space
point to form a seed vertex. If at least one additional candidate track is consistent with
intersecting this seed vertex, then the seed vertex is used as the secondary vertex. If the
first stage is not successful in finding a secondary vertex, a second pass is attempted. More
stringent track requirements (such as on $d/\sigma_d$ and $p_T$) are imposed on the candidate tracks.
All candidate tracks satisfying these stricter criteria are constrained to pass through the
same space point to form a seed vertex. This vertex has an associated $\chi^2$. Candidate tracks
that contribute too much to the $\chi^2$ are removed and a new seed vertex is formed. This
procedure is iterated until a seed vertex remains that has at least two associated tracks and
an acceptable value of $\chi^2$.

The decay length of the secondary vertex, $L_{xy}$, is the projection in the plane transverse
to the beam line of the vector pointing from the primary vertex to the secondary vertex
onto the jet axis. If the cosine of the angle between these two vectors is positive (negative),
then $L_{xy}$ is positive (negative). Most of the secondary vertices from the decay of b and
c-hadrons are expected to have positive $L_{xy}$; conversely, secondary vertices constructed from a
random combination of mismeasured tracks (mistags) have a symmetric distribution around
$L_{xy}$ = 0. To reduce the background, a jet is considered tagged by SECVTX if it contains a
secondary vertex with $L_{xy}/\sigma_{L_{xy}}$ > 3.0, where $\sigma_{L_{xy}}$ is the estimated uncertainty on $L_{xy}$ (typically
about 130 $\mu$m). The mistag contribution to positive SECVTX tags is evaluated using a
parameterization derived from negative tags in generic-jet data [J]. 
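The signed decay length and the tagging requirement can be written compactly. The sketch below assumes that vertex positions are given as (x, y) pairs in the transverse plane and that the jet axis is specified by its azimuth; the function names are hypothetical, and treating $L_{xy}/\sigma_{L_{xy}}$ < -3.0 as the negative-tag condition is an assumption made by symmetry with the positive-tag cut.

```python
import math


def signed_lxy(primary_xy, secondary_xy, jet_phi):
    """Projection of the transverse primary-to-secondary displacement onto the jet axis.

    The sign is that of the cosine of the angle between the displacement
    and the jet direction, as described in the text.
    """
    dx = secondary_xy[0] - primary_xy[0]
    dy = secondary_xy[1] - primary_xy[1]
    return dx * math.cos(jet_phi) + dy * math.sin(jet_phi)


def is_positive_tag(lxy, sigma_lxy, cut=3.0):
    """SECVTX tag requirement: L_xy / sigma > 3.0."""
    return lxy / sigma_lxy > cut


def is_negative_tag(lxy, sigma_lxy, cut=3.0):
    """Assumed mirror-image requirement, useful for mistag studies."""
    return lxy / sigma_lxy < -cut


# Hypothetical vertex displaced by 1 mm along the jet direction, with a
# 130 micron uncertainty (all lengths in cm).
lxy = signed_lxy((0.0, 0.0), (0.1, 0.0), jet_phi=0.0)
print(lxy, is_positive_tag(lxy, 0.013))
```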

A second b-tagging method is represented by the jet-probability (JPB) algorithm described
in detail in Ref. [7]. This tagging method compares track impact parameters to
measured resolution functions in order to calculate for each jet a probability that there are
no long-lived particles in the jet cone. The sign of the impact parameter is defined to be
positive if the point of closest approach to the primary vertex lies in the same hemisphere as
the jet direction, and negative otherwise. Jet-probability is defined using tracks with posi- 
tive impact parameter; we also define a negative jet-probability where we select only tracks 
with negative impact parameter in the calculation. Jet-probability is uniformly distributed 
for light quark or gluon jets, but is very small for jets containing displaced vertices from 
heavy flavor decays. A jet has a positive (negative) JPB tag if a jet-probability value smaller 
than 0.05 is derived using at least two tracks with positive (negative) impact parameter. 
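The combination of per-track probabilities into a jet-probability is not spelled out in this section. The sketch below uses the combination commonly adopted for this kind of algorithm, $P = \Pi \sum_{k=0}^{N-1} (-\ln \Pi)^k / k!$ with $\Pi$ the product of the N track probabilities, which is uniformly distributed when the inputs are independent and uniform. Both this formula and the assumption that track probabilities have already been obtained from the resolution functions are assumptions of the example, not statements from the text.

```python
import math


def jet_probability(track_probs):
    """Combine per-track probabilities into a single jet-probability.

    Assumed combination: P = Pi * sum_{k<N} (-ln Pi)^k / k!, where Pi is the
    product of the N track probabilities.
    """
    n_tracks = len(track_probs)
    if n_tracks == 0:
        raise ValueError("at least one track is required")
    product = math.prod(track_probs)
    if product == 0.0:
        return 0.0
    minus_log = -math.log(product)
    return product * sum(minus_log ** k / math.factorial(k) for k in range(n_tracks))


def has_jpb_tag(track_probs, cut=0.05, min_tracks=2):
    """Tag requirement from the text: probability below 0.05 from at least two tracks."""
    return len(track_probs) >= min_tracks and jet_probability(track_probs) < cut


print(jet_probability([0.5, 0.5]))      # consistent with the primary vertex
print(has_jpb_tag([0.01, 0.02, 0.30]))  # displaced tracks, hypothetical inputs
```

For a single track the combined value reduces to the track probability itself, which is a convenient sanity check.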

An alternative way to tag b quarks is to search a jet for soft leptons produced by $b \to \ell\nu c$
or $b \to c \to \ell\nu s$ decays. The soft lepton tagging algorithm is applied to sets of CTC tracks
associated with jets with $E_T$ > 15 GeV and $|\eta|$ < 2.0. CTC tracks are associated with a jet
if they are inside a cone of radius 0.4 centered around the jet axis. In order to maintain
high efficiency, the lepton $p_T$ threshold is set low at 2 GeV/c.

To search for soft electrons the algorithm extrapolates each track to the calorimeter and 
attempts to match it to a CES cluster. The matched CES cluster is required to be consistent 
in shape and position with the expectation for electron showers. In addition, it is required 
that 0.7 < E/P < 1.5 and $E_{had}/E_{em}$ < 0.1. The track specific ionization (dE/dx), measured
in the CTC, is required to be consistent with the electron hypothesis. Electron candidates 
must also have an energy deposition in the CPR corresponding to that left by at least four 
minimum-ionizing particles. The efficiency of the selection criteria has been determined 
using a sample of electrons produced by photon conversions @]. 

To identify soft muons, track segments reconstructed in the CMU, CMP or CMX systems 
are matched to CTC tracks. Only the CMU or CMX systems are used to identify muons 
with 2 < $p_T$ < 3 GeV/c. Muon candidate tracks with $p_T$ > 3 GeV/c within the CMU
and CMP fiducial volume are required to match to track segments in both systems. The
reconstruction efficiency has been measured using samples of muons from $J/\psi \to \mu^+\mu^-$ and
Z — > decays [Q. 

In the data, the rate of fake soft lepton tags which are not due to heavy flavor semileptonic
decays is evaluated using a parameterization of the SLT fake probability per track as a
function of the track isolation and $p_T$. This parameterization has been derived in a large
sample of generic-jet data [|J after removing the fraction of soft lepton tags contributed 
by heavy flavor (about 26%) 0. In the simulation, a SLT track is required to match at 
generator level a lepton coming from a b or c-hadron decay 0]. 



B. Monte Carlo generators and detector simulation 

We use three different Monte Carlo generators to estimate the contribution of SM pro- 
cesses to the W+ jet sample. The settings and the calibration of these Monte Carlo gener- 
ators are described in Ref. R. 



A few processes, including $t\bar{t}$ production, are evaluated using version 5.7 of pythia [19].
These processes are detailed in the next section.

The fraction of W + jet direct production with heavy flavor, namely $p\bar{p} \to Wg$ with
$g \to b\bar{b}, c\bar{c}$ (gluon splitting) and $p\bar{p} \to Wc$, is calculated using version 5.6 of the herwig
generator [20]. The part of the phase space region of these hard scattering processes that
is not correctly mapped by herwig (namely $Wb\bar{b}$ and $Wc\bar{c}$ events in which the two heavy
flavor partons produce two well separated jets) is evaluated using the vecbos generator [21].
VECBOS is a parton-level Monte Carlo generator and we transform the partons produced
by vecbos into hadrons and jets using herwig adapted to perform the coherent shower
evolution of both initial and final state partons from an arbitrary hard-scattering subprocess [22].
In summary, we use herwig to predict the fraction of W + ≥ 1 jet events where
only one jet contains b or c-hadrons while we rely on vecbos to extend the prediction to the
cases where two different jets contain heavy-flavored hadrons. The MRS D0$'$ set of structure
functions [23] is used with these generators. We set the b-mass value to 4.75 GeV/c$^2$ and
the c-mass value to 1.5 GeV/c$^2$.

The fraction of jets containing heavy flavor hadrons from gluon splitting predicted by
the Monte Carlo generators has been tuned using generic-jet data. As a result, the fraction
of $g \to b\bar{b}$ calculated by the generators is increased by the factor 1.40 ± 0.19 and the fraction
of $g \to c\bar{c}$ by the factor 1.35 ± 0.36. These factors are of the same size as those measured
by the SLC and LEP experiments for the rate of $g \to b\bar{b}$ and $g \to c\bar{c}$ in Z decays [24], and
are within the estimated theoretical uncertainties [25].

We use the CLEO Monte Carlo generator, QQ, to model the decay of b and c-hadrons [26].
All particles produced in the final state by the herwig (or pythia) + QQ generator package
are decayed and passed through the CDF detector simulation (called QFL). The detector
response is based upon parameterizations and simple models which depend on the particle 
kinematics. After the simulation of the CDF detector, the Monte Carlo events are treated as 
if they were real data. Ref. describes the calibration of the detector simulation, including 
tagging efficiencies, using several independent data samples. 



V. COMPARISON OF MEASURED AND PREDICTED RATES OF W + ≥ 1 JET
EVENTS WITH HEAVY FLAVOR TAGS

In this study, we compare the observed numbers of tagged W + jet events as a function
of the jet multiplicity to the SM prediction which uses the NLO calculation of the $t\bar{t}$ cross
section. The various contributions to W + jet events are discussed in subsection A, and the
results of the comparisons are summarized in subsection B.



A. Predicted contributions to the W+ jet event sample 

A detailed study of the non-$t\bar{t}$ contributions to the W + jet events was made in Ref. [7].
These studies are reviewed here, along with the $t\bar{t}$ contribution derived using the theoretical
prediction.

The small number of events contributed by non-W sources, including $b\bar{b}$ production, is
estimated using the data. The number of non-W events in the signal region (lepton I < 0.1
and $\not\!E_T$ > 20 GeV) is predicted by multiplying the number of events with I < 0.1 and $\not\!E_T$ <
10 GeV by the ratio R of events with I > 0.2 and $\not\!E_T$ > 20 GeV to events with I > 0.2 and
$\not\!E_T$ < 10 GeV. The number of tagged non-W events is predicted by multiplying the number
of tagged events with I < 0.1 and $\not\!E_T$ < 10 GeV by the same ratio R.
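The sideband extrapolation described above amounts to a one-line calculation. The sketch below illustrates it with hypothetical event counts; the function name and the purely statistical (Poisson) error propagation are assumptions of the example, not a description of the uncertainties used in the analysis.

```python
import math


def non_w_estimate(n_iso_low_met, n_noniso_high_met, n_noniso_low_met):
    """Predicted non-W events in the signal region (I < 0.1, missing E_T > 20 GeV).

    n_iso_low_met:     events with I < 0.1 and missing E_T < 10 GeV
    n_noniso_high_met: events with I > 0.2 and missing E_T > 20 GeV
    n_noniso_low_met:  events with I > 0.2 and missing E_T < 10 GeV
    Returns (estimate, statistical uncertainty), assuming Poisson counts.
    """
    if min(n_iso_low_met, n_noniso_high_met, n_noniso_low_met) <= 0:
        raise ValueError("all sideband counts must be positive")
    ratio = n_noniso_high_met / n_noniso_low_met  # the ratio R of the text
    estimate = n_iso_low_met * ratio
    relative_error = math.sqrt(
        1.0 / n_iso_low_met + 1.0 / n_noniso_high_met + 1.0 / n_noniso_low_met
    )
    return estimate, estimate * relative_error


# Hypothetical sideband counts, for illustration only.
estimate, error = non_w_estimate(400, 50, 1000)
print(f"non-W in signal region: {estimate:.1f} +- {error:.1f}")
```

The same ratio R applied to the number of tagged events in the isolated, low-$\not\!E_T$ region gives the tagged non-W prediction.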

The number of Z + jet events in which one lepton from the Z decay is not identified
(unidentified-Z) is calculated using the pythia generator. The simulated sample is normalized
to the number of $Z \to \ell\ell$ decays observed in the data for each jet bin. Unidentified-Z
+ jet events can be tagged either because a jet is produced by a $\tau$ originating from $Z \to \tau\tau$
decays or because a jet contains heavy flavor. The number of tagged $Z \to \tau\tau$ events is
estimated using the pythia simulation. The number of tags contributed by unidentified-Z
+ jet events with heavy flavor is estimated with a combination of the pythia, herwig and
VECBOS generators.

The contribution of diboson production before and after tagging is calculated using the
pythia generator. The values of the diboson production cross sections [$\sigma_{WW}$ = 9.5 ± 0.7
pb, $\sigma_{WZ}$ = 2.60 ± 0.34 pb and $\sigma_{ZZ}$ = 1.0 ± 0.2 pb] are taken from Ref. [27].

The contribution from single top production before and after tagging is estimated using
pythia to model the process $p\bar{p} \to t\bar{b}$ via a virtual s-channel W and herwig to model the
process $p\bar{p} \to t\bar{b}$ via a virtual t-channel W. The production cross sections [0.74 ± 0.05 pb
and 1.5 ± 0.4 pb for the s and t-channel, respectively] are derived using the NLO calculation
of Ref. [28].



The $t\bar{t}$ contribution is calculated using the pythia generator. We use $\sigma_{t\bar{t}}$ = 5.1 pb
with a 15% uncertainty. This number is the average of several NLO calculations of the $t\bar{t}$
production cross section [8].

The direct production of W + jets with heavy flavor is estimated using a combination
of data and simulation. Since the leading-order matrix element calculation has a 40%
uncertainty [29], we first evaluate in each jet bin the number of events due to W + jet direct
production as the difference between the data and the sum of all processes listed above,
including $t\bar{t}$ production, before tagging. We then use the herwig and VECBOS generators,
calibrated with generic jet data as discussed in Section IV B, to estimate the fraction of
W + jet events which contain $c\bar{c}$ or $b\bar{b}$ pairs and their tag contribution. The fraction of Wc
events and their tag contribution is determined using herwig.
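The two-step procedure described above (subtract the other processes from the data, then apply a simulated heavy flavor fraction and tagging efficiency) can be sketched as follows. All numbers in the example are hypothetical placeholders chosen for easy arithmetic; the function name and the single overall tagging efficiency are simplifying assumptions.

```python
def w_jets_heavy_flavor(n_data, other_processes, hf_fraction, tag_efficiency):
    """Estimate W + jet direct production and its tagged heavy flavor content in one jet bin.

    n_data:          observed events before tagging
    other_processes: dict of predicted events from all other sources (before tagging)
    hf_fraction:     simulated fraction of W + jet events with a given heavy flavor
    tag_efficiency:  simulated probability that such an event is tagged
    Returns (n_w_jets, n_heavy_flavor, n_tagged).
    """
    n_w_jets = n_data - sum(other_processes.values())
    n_heavy_flavor = n_w_jets * hf_fraction
    return n_w_jets, n_heavy_flavor, n_heavy_flavor * tag_efficiency


# Hypothetical inputs for a single jet-multiplicity bin.
others = {"non-W": 50.0, "diboson": 30.0, "ttbar": 20.0}
n_w, n_hf, n_tag = w_jets_heavy_flavor(1000, others, hf_fraction=0.02, tag_efficiency=0.25)
print(n_w, n_hf, n_tag)
```

Because the W + jet normalization is taken from the data, only the heavy flavor fractions and tagging efficiencies rely on the simulation in this scheme.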

The number of events in which a jet without heavy flavor (h.f.) is tagged because of 
detector effects (mistags) is estimated using a parametrization of the mistag probability (as 
a function of the jet transverse energy and track multiplicity), which has been derived from 
generic jet data. 

B. Comparison with a SM prediction using the theoretical estimate of $\sigma_{t\bar{t}}$

The composition of the W + jet event candidates before heavy flavor tagging is summarized
in Table I. As previously discussed in Section IV A, the heavy flavor content of the
W + jet sample is enriched by searching jets for a displaced secondary vertex (SECVTX
tag) or an identified lepton (SLT tag).

The composition of the W + jet events with SECVTX tags is shown in Table II and
those with SLT tags in Table III. The numbers of observed events with one (ST) or two
(DT) jets tagged by the SECVTX or SLT algorithms are compared to predictions for each
value of the jet multiplicity.

There is good agreement between the observed and predicted numbers of tagged events
for the four jet multiplicity bins. The probability that the observed numbers of events
with at least one SECVTX tag are consistent with the predictions in all four jet bins is
80%. The probability [30] that the observed numbers of events with at least one SLT tag are
consistent with the predictions in all four jet bins is 56%.
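A rough feeling for the level of agreement in each jet bin can be obtained from simple Poisson tail probabilities. The sketch below takes the SECVTX predictions and observations from Table II, adding single and double tags to obtain events with at least one tag. It ignores the uncertainties on the predictions and does not reproduce the combined probability quoted above, which is computed with the method of Ref. [30]; the function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing exactly k events (mean > 0)."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def prob_at_least(n_observed, mean):
    """Probability of observing n_observed or more events for the given mean."""
    below = sum(poisson_pmf(k, mean) for k in range(n_observed))
    return max(0.0, 1.0 - below)


# Events with at least one SECVTX tag (ST + DT), from Table II.
PREDICTED = {
    "W + 1 jet": 65.44,
    "W + 2 jet": 29.61 + 2.41,
    "W + 3 jet": 12.87 + 3.23,
    "W + >= 4 jet": 8.92 + 4.03,
}
OBSERVED = {
    "W + 1 jet": 66,
    "W + 2 jet": 35 + 5,
    "W + 3 jet": 10 + 6,
    "W + >= 4 jet": 11 + 2,
}

for jet_bin, mean in PREDICTED.items():
    p_value = prob_at_least(OBSERVED[jet_bin], mean)
    print(f"{jet_bin:>13}: predicted {mean:6.2f}, observed {OBSERVED[jet_bin]:3d}, "
          f"P(N >= observed) = {p_value:.2f}")
```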

In the next section we perform a more detailed study of heavy flavor content of the W +
jet sample by selecting events with jets containing both a displaced vertex and a soft lepton.

TABLE I. Estimated composition of the W + ≥ 1 jet sample before tagging.

21.3 ± 5.9
1027.7 ± 31.1
19.9 ± 6.1
$Wc$            86.8 ± 26.1     11.2 ± 3.4      1.9 ± 0.7
$Wc\bar{c}$     173.1 ± 46.2    61.9 ± 13.6     11.4 ± 2.6      2.3 ± 0.9
$Wb\bar{b}$     69.0 ± 9.5      29.7 ± 5.1      5.7 ± 1.1       1.5 ± 0.5

TABLE II. Summary of observed and predicted number of W events with one (ST) and two
(DT) SECVTX tags.

Source                  W + 1 jet         W + 2 jet         W + 3 jet         W + ≥ 4 jet
Mistags                 10.82 ± 1.08      3.80 ± 0.38       0.99 ± 0.10       0.35 ± 0.04
Non-W                   8.18 ± 0.78       1.49 ± 0.47       0.76 ± 0.38       0.31 ± 0.16
WW, WZ, ZZ              0.52 ± 0.14       1.38 ± 0.28       0.40 ± 0.13       0.00 ± 0.00
Single top              1.36 ± 0.35       2.38 ± 0.54       0.63 ± 0.14       0.14 ± 0.03
$Wc$                    16.89 ± 5.38      3.94 ± 1.30       0.51 ± 0.17       0.09 ± 0.04
$Wc\bar{c}$ (ST)        7.89 ± 2.17       3.54 ± 0.88       0.77 ± 0.25       0.16 ± 0.07
$Wc\bar{c}$ (DT)                          0.06 ± 0.04       0.00 ± 0.00       0.00 ± 0.00
$Wb\bar{b}$ (ST)        17.00 ± 2.41      8.35 ± 1.74       1.62 ± 0.40       0.41 ± 0.14
$Wb\bar{b}$ (DT)                          1.51 ± 0.52       0.31 ± 0.13       0.07 ± 0.03
$Z \to \tau\tau$        0.96 ± 0.30       0.70 ± 0.25       0.17 ± 0.12       0.00 ± 0.00
$Zc$                    0.14 ± 0.04       0.03 ± 0.01       0.01 ± 0.00       0.00 ± 0.00
$Zc\bar{c}$ (ST)        0.22 ± 0.06       0.10 ± 0.03       0.04 ± 0.02       0.00 ± 0.00
$Zc\bar{c}$ (DT)                          0.00 ± 0.00       0.00 ± 0.00       0.00 ± 0.00
$Zb\bar{b}$ (ST)        0.93 ± 0.14       0.46 ± 0.12
$t\bar{t}$ (DT)                                                               3.96 ± 1.03
SM prediction (ST)      65.44 ± 6.45      29.61 ± 2.66      12.87 ± 1.89      8.92 ± 1.95
SM prediction (DT)                        2.41 ± 0.56       3.23 ± 0.76       4.03 ± 1.03
Data (ST)               66                35                10                11
Data (DT)                                 5                 6                 2
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TABLE III. Summary of observed and predicted number of W events with one (ST) and two 



(DT) SLT tags. 


Source                  W + 1 jet        W + 2 jet        W + 3 jet        W+ ≥4 jet
Mistags                 101.92 ± 10.19   30.90 ± 3.09     7.34 ± 0.73      3.01 ± 0.30
Non-W                   8.96 ± 0.84      2.09 ± 0.56      0.38 ± 0.27      0.16 ± 0.11
WW, WZ, ZZ              0.50 ± 0.16      0.88 ± 0.22      0.10 ± 0.05      0.00 ± 0.00
Single top              0.38 ± 0.10      0.67 ± 0.15      0.18 ± 0.05      0.05 ± 0.01
Wc                      13.12 ± 4.27     4.29 ± 1.46      0.73 ± 0.32      0.13 ± 0.06
Wcc̄ (ST)                6.41 ± 1.89      2.70 ± 0.67      0.69 ± 0.22      0.14 ± 0.06
Wcc̄ (DT)                                 0.02 ± 0.02      0.00 ± 0.00      0.00 ± 0.00
Wbb̄ (ST)                5.31 ± 0.96      2.86 ± 0.67      0.47 ± 0.14      0.12 ± 0.05
Wbb̄ (DT)                                 0.09 ± 0.05      0.01 ± 0.01      0.00 ± 0.00
Z → ττ                  0.43 ± 0.20      0.09 ± 0.09      0.09 ± 0.09      0.00 ± 0.00
Zc                      0.11 ± 0.04      0.04 ± 0.01      0.01 ± 0.01      0.00 ± 0.00
Zcc̄ (ST)                0.17 ± 0.05      0.08 ± 0.02      0.03 ± 0.01      0.00 ± 0.00
Zcc̄ (DT)                                 0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00
Zbb̄ (ST)                0.29 ± 0.06      0.16 ± 0.04      0.05 ± 0.02      0.01 ± 0.01
Zbb̄ (DT)                                 0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00
tt̄ (ST)                 0.14 ± 0.06      1.35 ± 0.61      2.85 ± 1.30      3.36 ± 1.53


tt̄ (DT)
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U.U4 ± U.Uz 
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SM prediction (ST) 
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12.91 ± 1.57 


6.98 ± 1.57 


SM prediction (DT)                       0.14 ± 0.06      0.14 ± 0.06      0.18 ± 0.08


Data (ST) 
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56 


17 
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Data (DT) 
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VI. COMPARISON OF MEASURED AND PREDICTED RATES OF W+ JET 
EVENTS WITH BOTH A SECVTX AND SLT HEAVY FLAVOR TAG 



We begin this study by selecting W+ jet events with both SECVTX and SLT tags. In
Table IV the predicted and observed W+ jet events with a SLT tag are split into samples
without (top part of Table IV) and with (bottom part of Table IV) SECVTX tags. There
is good agreement between data and predictions for the W+ jet events with a SLT tag and
no SECVTX tag, where a large fraction of the events have fake SLT tags in jets without
heavy flavor. In contrast, the numbers of events with both SECVTX and SLT tags, which are
mostly contributed by real heavy flavor, are not well predicted by the simulation. Therefore,
we check if the rate of SLT tags in jets tagged by SECVTX (superjets) is consistent with
the expected production and decay of hadrons with heavy flavor.

After tagging with SECVTX, we estimate that approximately 70% of the W+ jet sample
contains b-jets and 20% contains c-jets (see Table II). On average, 20% of the b and c-hadron
decays produce a lepton (e or μ). Only 50% of the leptons resulting from a b-hadron satisfy
the 2 GeV/c transverse momentum requirement of the soft lepton tag (this fraction is slightly
smaller for c-hadron decays). In addition, the SLT tagger is approximately 90% efficient in
identifying muons and 50% efficient in identifying electrons. Altogether, we then expect that
about 7% of the jets tagged by SECVTX will contain an additional SLT tag if the heavy
flavor composition of W+ jet events is correctly understood.
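
The 7% figure can be checked with a few lines of arithmetic. The sketch below simply multiplies the approximate fractions quoted above; the equal split between electrons and muons is an assumption made only for this illustration, and the variable names are hypothetical.

```python
# Rough cross-check of the ~7% SLT rate expected in SECVTX-tagged jets.
# Inputs are the approximate fractions quoted in the text.
semileptonic_fraction = 0.20   # b or c hadron decays producing an e or mu
pt_acceptance = 0.50           # leptons passing the 2 GeV/c requirement
eff_muon = 0.90                # SLT efficiency for muons
eff_electron = 0.50            # SLT efficiency for electrons

# Assumption (not stated in the text): electrons and muons are equally likely.
mean_lepton_eff = 0.5 * (eff_muon + eff_electron)

slt_rate_per_tagged_jet = semileptonic_fraction * pt_acceptance * mean_lepton_eff
print(f"expected SLT rate in SECVTX-tagged jets: {slt_rate_per_tagged_jet:.1%}")
```

The product is dominated by the semileptonic branching fraction and the momentum acceptance, so modest changes in the assumed lepton mix shift the result by well under one percentage point.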

The observed numbers of events with a superjet are compared to the SM prediction
in Table V. The information in Table V is similar to that presented in Table IV, except
that two events listed in Table IV have the SLT and SECVTX tags in different jets. The
probability [30] that the observed numbers of events with at least one superjet are consistent
with the prediction in all four jet bins is 0.4%. This low probability value is mostly driven by
an excess in the W+ 2,3 jet bins where 13 events are observed² and 4.4 ± 0.6 are expected



2 The 13 events include tt̄ candidates, and four of these events are included in the sample used to
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from SM sources. The a posteriori probability of observing no less than 13 events is 0.1%. 
The probability for observing this excess of W+ 2,3 jet events with a superjet does not take 
into account the number of comparisons made in our studies in various jet-multiplicity bins 
and using different tagging algorithms. It is not possible to quantify precisely the effect of 
this "trial factor". We have carried out several statistical tests using different combinations
of the observed and predicted numbers of single and double tags reported in Tables I
through V. These combinations always include the observed numbers of supertags. We
have used both a likelihood method |K| and other statistical techniques, which combine the
probabilities of observing a number of tagged events at least as large as the data. These 
studies yield probabilities in the range of one to several percent. 
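
The a posteriori probability quoted above (no fewer than 13 events for 4.4 ± 0.6 expected) can be roughly reproduced with a Poisson tail averaged over a Gaussian uncertainty on the mean. This is a minimal sketch under that assumption; the text does not spell out the exact procedure used, and the function names are hypothetical.

```python
import math

def poisson_tail(n_obs, mu):
    """P(N >= n_obs) for a Poisson distribution with mean mu."""
    if mu <= 0.0:
        return 0.0 if n_obs > 0 else 1.0
    term = math.exp(-mu)          # P(N = 0)
    below = 0.0
    for k in range(n_obs):        # accumulate P(N = 0) ... P(N = n_obs - 1)
        below += term
        term *= mu / (k + 1)
    return max(0.0, 1.0 - below)

def smeared_tail(n_obs, mu, sigma, steps=2001, n_sigma=5.0):
    """Average the Poisson tail over a Gaussian-distributed mean, truncated at zero."""
    lo = max(0.0, mu - n_sigma * sigma)
    hi = mu + n_sigma * sigma
    step = (hi - lo) / (steps - 1)
    num = den = 0.0
    for i in range(steps):
        m = lo + i * step
        weight = math.exp(-0.5 * ((m - mu) / sigma) ** 2)
        num += weight * poisson_tail(n_obs, m)
        den += weight
    return num / den

p_fixed = poisson_tail(13, 4.4)            # expectation taken as exact
p_smeared = smeared_tail(13, 4.4, 0.6)     # expectation smeared by its uncertainty
print(f"fixed mean: {p_fixed:.2%}, smeared mean: {p_smeared:.2%}")
```

Including the uncertainty on the expectation raises the tail probability, because the Poisson tail grows quickly with the mean; both numbers are of order 0.1%.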

The cause of the excess of W+ 2,3 jet events with supertags could be a discrepancy 
in the correlation between the SLT and SECVTX efficiencies in the data and simulation. 
These simulated efficiencies have been tuned separately using the data and, in principle, the 
SLT tagging efficiency in jets already tagged by SECVTX could be higher in the data than 
in the simulation. We have checked this using generic-jet data (see Appendix A) and we 
conclude that the excess of W+ 2,3 jet events with a supertag cannot be explained by this 
type of simulation deficiency. 



measure the top quark mass (T^](see also Appendix B). 
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TABLE IV. Summary of observed and predicted number of W events with a soft lepton tag. 
The data sample is split in events with and without SECVTX tags. 



Source                  W + 1 jet        W + 2 jet        W + 3 jet        W+ ≥4 jet

Events without SECVTX tags
Data                    9388             1330             182
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SLT mistags in 
























W+ jet without h.f.     93.31 ± 9.33     24.81 ± 2.48     4.74 ± 0.47      1.26 ± 0.13
Non-W                   8.39 ± 0.67      1.67 ± 0.44      0.31 ± 0.22      0.13 ± 0.09
WW, WZ, ZZ              0.83 ± 0.15      1.58 ± 0.21      0.31 ± 0.04      0.05 ± 0.00
Single top              0.27 ± 0.06      0.46 ± 0.09      0.13 ± 0.03      0.03 ± 0.01
Wc                      16.97 ± 4.08     5.99 ± 1.40      1.10 ± 0.30      0.22 ± 0.06
Wcc̄                     7.99 ± 1.81      3.78 ± 0.51      1.02 ± 0.39      0.25 ± 0.12
Wbb̄                     4.47 ± 0.68      2.26 ± 0.43      0.31 ± 0.07      0.10 ± 0.03
Z → ττ                  0.83 ± 0.20      0.40 ± 0.09      0.15 ± 0.09      0.02 ± 0.00
Zc                      0.14 ± 0.03      0.05 ± 0.01      0.02 ± 0.01      0.00 ± 0.00
Zcc̄                     0.22 ± 0.05      0.11 ± 0.03      0.05 ± 0.02      0.01 ± 0.00
Zbb̄                     0.23 ± 0.04      0.11 ± 0.03      0.03 ± 0.01      0.00 ± 0.00
tt̄                      0.11 ± 0.05      0.85 ± 0.31      1.90 ± 0.65      2.15 ± 0.69
SM prediction           133.75 ± 10.38   42.06 ± 2.99     10.06 ± 0.98     4.22 ± 0.72
Data with SLT tags      145              47
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Events with SECVTX tags 
Data                    66               40               16               13
SECVTX mistags in
  events with SLT tags  0.28 ± 0.03      0.20 ± 0.02      0.16 ± 0.02      0.05 ± 0.01
Non-W                   0.57 ± 0.05      0.42 ± 0.11      0.08 ± 0.05      0.03 ± 0.02
WW, WZ, ZZ              0.02 ± 0.02      0.16 ± 0.03      0.03 ± 0.01      0.00 ± 0.00
Single top              0.12 ± 0.04      0.32 ± 0.06      0.09 ± 0.02      0.02 ± 0.01
Wc                      0.88 ± 0.29      0.38 ± 0.12      0.17 ± 0.02      0.02 ± 0.00
Wcc̄                     0.41 ± 0.13      0.41 ± 0.13      0.14 ± 0.05      0.03 ± 0.01
Wbb̄                     1.58 ± 0.33      1.40 ± 0.30      0.40 ± 0.08      0.11 ± 0.02
Z → ττ                  0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00
Zc                      0.01 ± 0.00      0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00
Zcc̄                     0.01 ± 0.00      0.01 ± 0.00      0.01 ± 0.00      0.00 ± 0.00
Zbb̄                     0.08 ± 0.02      0.06 ± 0.02      0.03 ± 0.01      0.01 ± 0.00
tt̄                      0.04 ± 0.02      0.78 ± 0.30      1.88 ± 0.65      2.65 ± 0.85
SM prediction           4.00 ± 0.47      4.15 ± 0.50      2.99 ± 0.66      2.93 ± 0.85
Data with SECVTX
  and SLT tags          1                9                5                3
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TABLE V. Observed and predicted number of W+ jet events with a supertag. The subsample 


of events with an additional SECVTX tag (DT) is also listed.

Source                     W + 1 jet        W + 2 jet        W + 3 jet        W+ ≥4 jet
SECVTX mistags in
  events with SLT tags     0.28 ± 0.03      0.09 ± 0.01      0.07 ± 0.01      0.02 ± 0.00
Non-W                      0.57 ± 0.05      0.13 ± 0.03      0.00 ± 0.00      0.00 ± 0.00
WW, WZ, ZZ                 0.02 ± 0.02      0.13 ± 0.06      0.01 ± 0.01      0.00 ± 0.00
Single top                 0.12 ± 0.04      0.24 ± 0.05      0.07 ± 0.02      0.02 ± 0.00
Wc                         0.88 ± 0.29      0.24 ± 0.14      0.14 ± 0.10      0.00 ± 0.00
Wcc̄                        0.41 ± 0.13      0.25 ± 0.09      0.13 ± 0.06      0.00 ± 0.00
Wbb̄                        1.58 ± 0.33      1.07 ± 0.26      0.19 ± 0.09      0.01 ± 0.00
Z → ττ                     0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00
Zc                         0.01 ± 0.00      0.00 ± 0.00      0.00 ± 0.00      0.00 ± 0.00
Zcc̄                        0.01 ± 0.00      0.01 ± 0.00      0.01 ± 0.00      0.00 ± 0.00
Zbb̄                        0.08 ± 0.02      0.05 ± 0.02      0.02 ± 0.01      0.00 ± 0.00
tt̄                         0.04 ± 0.02      0.48 ± 0.19      1.08 ± 0.40      1.42 ± 0.49
SM prediction (supertags)  4.00 ± 0.50      2.69 ± 0.41      1.71 ± 0.40      1.47 ± 0.51
SM prediction (DT)                          0.26 ± 0.06      0.36 ± 0.08      0.50 ± 0.13
Data (supertags)           1                8                5                2
Data (DT)                                   2                3
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VII. PROPERTIES OF THE EVENTS WITH A SUPERJET 



Having observed an excess of W+ 2,3 jet events with a supertag, we next compare 
the kinematics of these events with the SM simulation. We check the simulation using a 
complementary W+ 2,3 jet sample of data. This sample is described in subsection A. In 
subsection B we compare the heavy flavor content of the additional jets in events with a 
superjet and in the complementary sample. In subsections C and D we compare several 
kinematical distributions of these events to the simulation. 

A. Complementary data sample 

We check our simulation by studying a larger data sample consisting of W+ 2,3 jet events
with a SECVTX tag, but no supertags. The numbers of observed and predicted events are
compared in Table VI (43 W+ 2,3 jet events are observed, in agreement with the SM
prediction of 43.6 ± 3.3). We have chosen this sample because, as shown by the comparison
of Table VI with Table V, its composition is quite similar to that of W+ jet events with a supertag³.
In order to have a complementary sample of data with the same kinematical acceptance as
the events with a supertag, we also require that at least one of the jets tagged by SECVTX
contains a soft lepton candidate track. After this additional requirement this sample of W+
2,3 jet events consists of 42 events (the SM prediction is 41.2 ± 3.1 events). We note that,
while closely related, this event sample still has a few features which are different from the
superjet sample. For instance, most of the superjets are expected to be produced by heavy
flavor semileptonic decays, in which the corresponding neutrino escapes detection, while
in the complementary sample SECVTX tagged jets are predominantly produced by purely
hadronic decays of heavy flavors. However, according to the simulation, a large fraction of



3 W+ 2,3 jet events with a SLT tag and no supertags form another, larger-statistics data set; however,
their heavy flavor composition is quite different from that expected for events with a superjet.
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heavy flavor semileptonic decays is not identified by the SLT algorithm and is also included 
in the complementary sample. All such effects are in principle described by the simulation. 



B. Heavy flavor content of additional jets 

The heavy flavor content of the second and third jet in the events can be inferred from
the rate of additional SECVTX tags. Tables V and VI show the number of observed and
predicted events with an additional jet tagged by SECVTX in superjet events and in the
complementary sample. In the latter data sample, in which according to the simulation in
Table VI most of the events contain a second jet with b flavor, there are 6 W+ 2,3 jet events
with a double SECVTX tag, in agreement with the expectation of 5.02 ± 0.84 events.

Of the 13 W+ 2,3 jet events with a superjet, 5 contain an additional SECVTX tag. If
the 13 events are a fluctuation of SM processes, we expect to find 1.8 ± 0.3 events with a
double tag⁴. The probability of observing 5 or more W+ 2,3 jet events with double tags
is 4.1%. Given the high probability of finding an additional SECVTX tag, we apply b-jet
specific energy corrections to the additional jets in the event. These jets are later referred
to as "b-jets".
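
The probability of finding 5 or more double tags when 1.8 ± 0.3 are expected can be estimated with toy experiments. The sketch below assumes Poisson counting with a Gaussian-smeared expectation, which is one common convention and not necessarily the procedure used for the 4.1% quoted above; the function names and the number of toys are illustrative choices.

```python
import math
import random

def sample_poisson(mu, rng):
    """Draw one Poisson-distributed integer with mean mu (Knuth's method, fine for small mu)."""
    limit = math.exp(-mu)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def toy_tail_probability(n_obs, mu, sigma, n_toys=100_000, seed=1):
    """Fraction of toy experiments with at least n_obs events."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_toys):
        mean = rng.gauss(mu, sigma)
        while mean <= 0.0:              # redraw unphysical (non-positive) means
            mean = rng.gauss(mu, sigma)
        if sample_poisson(mean, rng) >= n_obs:
            hits += 1
    return hits / n_toys

p_double = toy_tail_probability(5, 1.8, 0.3)
print(f"P(N >= 5 | 1.8 +- 0.3) ~ {p_double:.1%}")
```

With these assumptions the toy estimate comes out at a few percent, the same order as the value given in the text.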



4 The prediction is 0.62 ± 0.10 events with a double tag in 4.4 events with a superjet. 
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TABLE VI. Observed and predicted number of W+ jet events tagged by SECVTX after remov- 



ing events with a supertag. The subsample of events with an additional SECVTX tag (DT) is also 


listed. 










Source                  W + 1 jet        W + 2 jet        W + 3 jet        W+ ≥4 jet
Mistags                 10.52 ± 1.00     3.72 ± 0.34      0.93 ± 0.09      0.34 ± 0.04
Non-W                   7.61 ± 0.06      1.36 ± 0.04      0.76 ± 0.03      0.31 ± 0.03
WW, WZ, ZZ              0.50 ± 0.14      1.25 ± 0.25      0.40 ± 0.13      0.00 ± 0.00
Single top              1.24 ± 0.31      2.15 ± 0.49      0.56 ± 0.13      0.12 ± 0.03
Wc                      16.02 ± 5.13     3.70 ± 1.29      0.37 ± 0.13      0.09 ± 0.03
Wcc̄                     7.48 ± 2.08      3.35 ± 0.86      0.64 ± 0.22      0.16 ± 0.06
Wbb̄                     15.42 ± 2.21     8.80 ± 1.63      1.74 ± 0.40      0.47 ± 0.13
Z → ττ                  0.96 ± 0.30      0.70 ± 0.25      0.17 ± 0.12      0.00 ± 0.00
Zc                      0.13 ± 0.04      0.03 ± 0.01      0.01 ± 0.00      0.00 ± 0.00
Zcc̄                     0.21 ± 0.06      0.10 ± 0.03      0.03 ± 0.02      0.00 ± 0.00
Zbb̄                     0.85 ± 0.13      0.48 ± 0.11      0.19 ± 0.06      0.02 ± 0.02


n 


u.ou ± U.1D 


q 4- i nn 

o.OZ ± l.UU 


O.DD ± Z.oo 


n net 4_ o /in 


SM prediction           61.44 ± 6.09     29.26 ± 2.58     14.39 ± 2.34     11.48 ± 2.37
SM prediction (DT)                       2.15 ± 0.50      2.87 ± 0.67      3.53 ± 0.90
Data                    65               32               11               11
Data (DT)                                3                3                2
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C. Method for testing if the data are consistent with the SM simulation 

In the next subsection we study distributions of several simple kinematic variables $x_i$ for
the 13 events with a superjet and the complementary sample of 42 events. Each data
distribution is compared with the sum of the 12 SM contributions, $SM_j(x_i)$, listed in Tables V
and VI using a Kolmogorov-Smirnov (K-S) test [31], [32]. Using the cumulative distribution
functions $F(x_i)$ and $H(x_i)$ of the two distributions to be compared, the K-S distance is
defined as $\delta = \max\,(F(x_i) - H(x_i)) + \max\,(H(x_i) - F(x_i))$. This is Kuiper's definition
of the K-S distance pj.
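
As an illustration of the distance defined above, the following minimal sketch computes the Kuiper form of the K-S distance between the empirical cumulative distribution of an unbinned sample and a model cumulative distribution. The uniform model and the three sample values are hypothetical; in the analysis the reference is the simulated SM distribution.

```python
def kuiper_distance(sample, model_cdf):
    """Kuiper distance: max(F - H) + max(H - F), with F the empirical CDF of
    `sample` and H the model CDF evaluated at the sample points."""
    xs = sorted(sample)
    n = len(xs)
    # F just after the i-th ordered point is (i + 1) / n, just before it is i / n.
    d_plus = max((i + 1) / n - model_cdf(x) for i, x in enumerate(xs))
    d_minus = max(model_cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus + d_minus

def uniform_cdf(x):
    """CDF of a uniform distribution on [0, 1] (hypothetical reference model)."""
    return min(1.0, max(0.0, x))

delta = kuiper_distance([0.1, 0.4, 0.7], uniform_cdf)
print(f"Kuiper distance: {delta:.3f}")
```

Unlike the ordinary K-S statistic, which takes only the larger of the two one-sided deviations, the Kuiper sum is equally sensitive to discrepancies in the tails and in the core of the distribution.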

For each variable $x_i$, the probability distribution of the K-S distance, $W_i(\delta)$, is determined
with Monte Carlo pseudo-experiments. In each experiment, we randomly generate parent
distributions $\sum_{j=1}^{12} SM_j(x_i)$ for two and three jet events independently. The integral
$I_j = \int SM_j(x_i)\,dx_i$ is the predicted yield of process $j$
and, in each pseudo-experiment, the value used for $I_j$ accounts for Poisson fluctuations and Gaussian
uncertainties in $I_j$. We use these parent distributions to randomly generate the same number
of $x_i$-values as in the data, but we evaluate the K-S distance of the $x_i$ distribution in
each pseudo-experiment with respect to the parent distribution $\sum_{j=1}^{12} SM_j(x)$. Using the so
derived $W_i(\delta)$ distribution, we define the probability that the $x_i$ distribution of the data
is consistent with the SM simulation as $P_i = \int_{\delta^0}^{\infty} W_i(\delta)\,d\delta$, where $\delta^0$ is the K-S distance of
the data.
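
The pseudo-experiment procedure can be sketched in a few dozen lines. The example below is a simplified, binned illustration with two hypothetical SM components instead of twelve; the template shapes, yields, uncertainties, and function names are all assumptions made for the sketch and are not taken from the analysis.

```python
import math
import random

def kuiper_binned(counts, reference):
    """Kuiper distance between two binned distributions with the same binning."""
    n_counts = float(sum(counts))
    n_ref = float(sum(reference))
    f = h = d_plus = d_minus = 0.0
    for c, r in zip(counts, reference):
        f += c / n_counts            # cumulative fraction of the tested sample
        h += r / n_ref               # cumulative fraction of the reference
        d_plus = max(d_plus, f - h)
        d_minus = max(d_minus, h - f)
    return d_plus + d_minus

def sample_poisson(mu, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def ks_probability(data_counts, shapes, yields, errors, n_toys=2000, seed=7):
    """Return (delta0, P), with P the fraction of toys whose distance is >= delta0."""
    rng = random.Random(seed)
    n_bins = len(data_counts)
    n_events = sum(data_counts)
    nominal = [sum(y * s[b] for s, y in zip(shapes, yields)) for b in range(n_bins)]
    delta0 = kuiper_binned(data_counts, nominal)
    n_larger = done = 0
    while done < n_toys:
        # Fluctuate each yield: Gaussian uncertainty first, then Poisson statistics.
        parent = [0.0] * n_bins
        for s, y, e in zip(shapes, yields, errors):
            n_j = sample_poisson(max(0.0, rng.gauss(y, e)), rng)
            for b in range(n_bins):
                parent[b] += n_j * s[b]
        if sum(parent) == 0.0:
            continue                 # empty parent distribution: redraw
        # Generate as many values as in the data from the fluctuated parent ...
        toy = [0] * n_bins
        for b in rng.choices(range(n_bins), weights=parent, k=n_events):
            toy[b] += 1
        # ... but measure the distance with respect to the nominal sum.
        if kuiper_binned(toy, nominal) >= delta0:
            n_larger += 1
        done += 1
    return delta0, n_larger / n_toys

shapes = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]   # hypothetical unit-normalized templates
yields, errors = [3.0, 1.4], [0.5, 0.3]                 # hypothetical yields and uncertainties
delta_a, p_a = ks_probability([4, 3, 3, 3], shapes, yields, errors)
delta_b, p_b = ks_probability([0, 0, 1, 12], shapes, yields, errors)
print(f"model-like data: delta0 = {delta_a:.3f}, P = {p_a:.3f}")
print(f"atypical data:   delta0 = {delta_b:.3f}, P = {p_b:.3f}")
```

Fluctuating the composition of the parent distribution broadens the distribution of distances relative to sampling from the nominal sum alone, so the resulting probabilities are more conservative than those of a textbook K-S test.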



D. Comparison of kinematical distributions in the data with the SM simulation 



We test if the events with a superjet are consistent with the SM prediction by comparing
the production cross sections $\frac{d^2\sigma}{dp_T\,d\eta}$ of each object in the final state. In all SM processes
contributing to these events, these differential cross sections approximately factorize, and
$\frac{d^2\sigma}{dp_T\,d\eta} \approx f(p_T) \cdot g(\eta)$. Therefore we compare data and SM simulation in the following
kinematical variables: the transverse energy and pseudo-rapidity distributions of the primary
leptons, the superjets, the additional jets in the event (referred to as b-jets), and the neutral
object producing the missing energy in the event⁵. The kinematics of the neutral object
producing the missing energy cannot be measured directly. However, correlated quantities
are the transverse energy and the rapidity of the recoiling system l + b + suj composed of
the primary lepton (l), the superjet (suj) and each additional jet (b) in the event. Since the
total transverse momentum of the events is conserved, in W+ 2 jet events the transverse
energy $E_T^{l+b+suj}$ of the system l + b + suj is a measure of the missing transverse energy. In
the rest frame of the initial state partons producing W+ 2 jet events, the rapidities of the
system l + b + suj and of the object producing the missing energy are also correlated. This
correlation is however smeared by the unknown Lorentz boost of the initial parton system.
For uniformity, in W+ 3 jet events we use the same variables with two entries per event
(corresponding to the two possible choices for the b-jet).

We finally test the distribution of the azimuthal angle $\delta\phi^{l,b+suj}$ between the primary
lepton and the system b + suj composed of the superjet and each additional b-jet with the
purpose of checking if the events are consistent with the simulated production and decay of
W bosons. The W transverse mass can be described with the variables $E_T^{l}$ and $\not{E}_T$, which
are already used, and the azimuthal angle between the primary lepton and the W direction.
Since the total transverse momentum of the events is conserved, in W+ 2 jet events this
azimuthal angle can be inferred from the supplementary angle $\delta\phi^{l,b+suj}$. For uniformity, in
W+ 3 jet events we use the same variable with two entries per event.

This minimal set of 9 variables is sufficient to describe the kinematics of the final state
with relatively modest correlations. The observed and predicted distributions of these
kinematical variables are compared in Figures 1 to 9. For each comparison, we show the
probability P that the data are consistent with the simulation. Table VII summarizes the



5 Jet energies are corrected using the full set of correction functions developed to measure the top 
mass 
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probabilities of these comparisons. The SM simulation models correctly the complemen- 
tary sample of data, but has a systematically low probability of being consistent with the 
kinematical distributions of the events with a superjet. 

In addition, one notices that the rapidity distributions of the primary lepton and the
jets in the 13 events (Figures 2, 4, and 6) are not symmetric around $\eta = 0$ and are
more populated at positive rapidities. These observations led to additional investigations
of the characteristics of the 13 events exploring the possibility that some detector effects
were not properly modeled by the simulation. These studies have not revealed any anomaly
which could be taken as an indication of detector problems. In particular, asymmetries
due to detector problems are not visible in the complementary sample nor in the larger
statistics sample of generic-jet data. However, as shown in Figure 10, we discovered that the
primary vertex of these events has an asymmetric z-distribution (z is the axis along the beam
line). Again, such an asymmetry is not observed in any of the large statistics data samples
available. The binomial probability of observing an equal or larger asymmetry due to a
statistical fluctuation in the distribution of the event vertex is 1.1%. Similar probabilities
for the asymmetry in several rapidity distributions are in the range between 1.5 and 10%.
Since we know of no physics process that would produce such asymmetries, it is possible
that an obscure detector problem, not seen in other samples, is responsible; or it may be
that these asymmetries are due to a low probability statistical fluctuation.
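
A binomial asymmetry probability of this kind is straightforward to evaluate. The sketch below uses a hypothetical split of 11 out of 13 events on the same side, chosen purely for illustration since the actual split is not stated in this section; under that assumption the one-sided probability happens to be close to the quoted 1.1%.

```python
from math import comb

def binomial_tail(n, k, p=0.5):
    """P(K >= k) for K distributed as Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k, n + 1))

# Hypothetical split (assumption): 11 of the 13 events on one side of z = 0.
p_one_sided = binomial_tail(13, 11)
print(f"P(at least 11 of 13 on a given side) = {p_one_sided:.2%}")
```

A two-sided version, which allows the excess to occur on either side, would simply double this value for a symmetric null hypothesis.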
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TABLE VII. Results of the K-S comparison between data and simulation. For each variable we 
list the observed K-S distance $\delta^0$ and the probability P of making an observation with a distance
no smaller than $\delta^0$.

                        Events with a superjet      Complementary sample
Variable                $\delta^0$    P (%)         $\delta^0$    P (%)
$E_T^{l}$               0.47          2.6           0.14          70.9
$\eta^{l}$              0.54          0.10          0.12          72.7
$E_T^{suj}$             0.38          11.1          0.15          43.0
$\eta^{suj}$            0.36          15.2          0.13          73.4
$E_T^{b}$               0.36          6.7           0.18          8.6
$\eta^{b}$              0.38          6.8           0.11          80.0
$E_T^{l+b+suj}$         0.39          2.5           0.17          18.8
$y^{l+b+suj}$           0.31          13.8          0.19          7.8
$\delta\phi^{l,b+suj}$  0.43          1.0           0.12          77.9
$z_{vrtx}$              0.48          1.7           0.16          50.5
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FIG. 1. Distributions of the transverse energy of the primary lepton for the data (•) are com- 
pared to the SM prediction (shaded histograms). The dotted histograms show the SM simulation 
normalized to the data. The probability distribution of the K-S distance $\delta$ is calculated with
Monte Carlo pseudo-experiments (see text). The vertical line indicates the observed distance $\delta^0$
between the cumulative distributions of the data and the simulation. The integral of the shaded
area represents the probability P of measuring a K-S distance no smaller than $\delta^0$.
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FIG. 2. Distribution of the pseudo-rapidity of the primary lepton in events with a superjet and 
in the complementary sample.
FIG. 3. Distribution of the transverse energy of the superjet in events with a superjet and in
the complementary sample.
FIG. 4. Distribution of the pseudo-rapidity of the superjet in events with a superjet and in the
complementary sample.
FIG. 5. Distribution of the transverse energy of all b-jets in events with a superjet and in the
complementary sample.
FIG. 6. Distribution of the pseudo-rapidity of all b-jets in events with a superjet and in the
complementary sample.
FIG. 7. Distribution of the transverse energy of the system l+superjet+b-jet in events with a
superjet and in the complementary sample.
FIG. 8. Distribution of the rapidity of the system l+superjet+b-jet in events with a superjet
and in the complementary sample.
FIG. 9. Distribution of the azimuthal angle between the primary lepton and the superjet+b-jet
system in events with a superjet and in the complementary sample.

FIG. 10. Distribution of the event-vertex position along the beam line (z-axis), $z_{vrtx}$ (cm), in events with a
superjet and in the complementary sample.
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The set of 9 kinematic variables used to compare data and simulation is not the only 
possible choice. We also looked at 9 complementary variables, and Table VIII shows the
result of the K-S test for this set of kinematic distributions: $\not{E}_T$, the corrected transverse
missing energy; $M_T^{W}$, the W transverse mass calculated using the primary lepton and $\not{E}_T$;
$M^{b+suj}$, $y^{b+suj}$, and $E_T^{b+suj}$, the invariant mass, rapidity, and transverse energy of the system
b + suj, respectively; $M^{l+b+suj}$, the invariant mass of the system l + b + suj; $\delta\theta^{b,suj}$ and
$\delta\phi^{b,suj}$, the angle and the azimuthal angle between the superjet and the b-jets, respectively;
and $\delta\theta^{l,b+suj}$, the angle between the primary lepton and the system b + suj. The simulation
correctly models these distributions for the complementary sample, while the probabilities 
for events with a superjet are systematically lower. However, the disagreement between 
events with a superjet and their simulation is much reduced for this second set of variables. 
The probability distribution of the K-S comparisons for the 18 kinematic distributions is 
shown in Figure [11]. 

TABLE VIII. K-S comparison of additional kinematical variables. For each variable we list
the observed K-S distance $\delta^0$ and the probability P of making an observation with a distance no
smaller than $\delta^0$.

                          Events with a superjet      Complementary sample
Variable                  $\delta^0$    P (%)         $\delta^0$    P (%)
$\not{E}_T$               0.31          27.1          0.14          57.1
$M_T^{W}$                 0.36          13.1          0.16          38.2
$M^{b+suj}$               0.36          4.0           0.12          58.9
$y^{b+suj}$               0.35          7.1           0.14          34.9
$E_T^{b+suj}$             0.28          24.0          0.10          60.1
$M^{l+b+suj}$             0.31          21.0          0.15          33.6
$\delta\theta^{b,suj}$    0.26          30.1          0.15          41.1
$\delta\phi^{b,suj}$      0.31          15.3          0.10          83.8
$\delta\theta^{l,b+suj}$  0.25          37.3          0.16          35.7
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FIG. 11. Distribution of the probabilities P that the 13 events with a superjet (a) and the 
complementary sample (b) are consistent with the SM prediction. The distribution (a) has a mean 
of 0.13 and a RMS of 0.11; the distribution (b) has a mean of 0.50 and a RMS of 0.24. 



As indicated by the figure, the probabilities of the complementary sample appear to be 
flatly distributed, as expected for a set of distributions consistent with the simulation. In 
contrast, the probabilities of the superjet events cluster at low values. This indicates the 
difficulty of our simulation to describe the kinematics of events with a superjet. Given the a 
posteriori selection of the 9 kinematic variables, the combined statistical significance of the 
observed discrepancies cannot be unequivocally quantified. A thorough discussion of this 
issue is beyond the goal of this paper, which is meant to present the basic measurements. We 
leave additional studies of these events and their possible interpretation to other publications. 
The characteristics of these events are listed in Appendix B. 
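
The means and RMS values quoted in the caption of Figure 11 can be checked directly from the 18 probabilities tabulated in Tables VII and VIII (the vertex position $z_{vrtx}$ is not one of the 18 kinematic variables and is left out). The short sketch below does this; using the population standard deviation for the RMS is an assumption about the convention.

```python
from statistics import mean, pstdev

# K-S probabilities in percent: nine values from Table VII, then nine from Table VIII.
superjet = [2.6, 0.10, 11.1, 15.2, 6.7, 6.8, 2.5, 13.8, 1.0,
            27.1, 13.1, 4.0, 7.1, 24.0, 21.0, 30.1, 15.3, 37.3]
complementary = [70.9, 72.7, 43.0, 73.4, 8.6, 80.0, 18.8, 7.8, 77.9,
                 57.1, 38.2, 58.9, 34.9, 60.1, 33.6, 41.1, 83.8, 35.7]

summary = {}
for name, values in (("superjet", superjet), ("complementary", complementary)):
    fractions = [v / 100.0 for v in values]
    summary[name] = (mean(fractions), pstdev(fractions))
    print(f"{name:14s} mean = {summary[name][0]:.2f}  RMS = {summary[name][1]:.2f}")

# For reference: a flat distribution on [0, 1] has mean 0.5 and RMS 1/sqrt(12), about 0.29.
```

The complementary sample sits close to the flat expectation, while the superjet probabilities are concentrated well below 0.5.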
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VIII. CHECK OF THE ISOLATION AND LIFETIME OF THE PRIMARY AND 

SOFT LEPTONS 



The kinematics of the primary leptons in events with a superjet is poorly described by
the SM simulation, in which they mostly come from W decays. Therefore, we
cross-check that the excess of events with a superjet is not due to a misestimate of the
number of non-W events. According to the SM prediction, the small background of tagged
non-W events is due to semileptonic decays in bb̄ and cc̄ events. In such a case, the primary
leptons are not isolated and have large impact parameters because of the long b and c quark
lifetime. Figure 12 shows that primary leptons in the 13 events with a superjet are at least as
well isolated as primary leptons in the complementary sample. Distributions of the signed
impact parameter significance of the primary lepton track are also shown in Figure 12.
Tracks from long-lived decays usually have large (> 3) impact parameter significance. The
primary leptons in the 13 events are consistent with being prompt. One also notes that
in the complementary sample two events have primary leptons with large positive impact
parameter; this is consistent with our estimate of 2.10 ± 0.05 non-W events (mostly from
b-decays).

Based on the SM expectation, the average transverse momenta of primary and soft
leptons are expected to differ by an order of magnitude (they are selected with a 20 and 2
GeV/c transverse momentum requirement, respectively). However, in the data the average
transverse momenta are 35 and 13 GeV/c, respectively. Since the W+ ≥ 1 jet sample has
been selected by removing all events containing a second lepton candidate with isolation I <
0.15 and transverse momentum $p_T$ > 10 GeV/c, the superjets could be due to dilepton events
which are not removed because the second lepton happens to be merged with a jet and is not
isolated. We have removed only 16 dilepton candidate events tagged by SECVTX from the
W+ 2,3 jet sample. From the simulation we expect that fewer than 0.5 events will have the
second lepton randomly distributed in a cone of radius 0.4 around the axis of the jet tagged
by SECVTX. Figure 13 shows that soft leptons are mostly found close to the superjet axis
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and are not uniformly distributed over the jet clustering cone of radius R = 0.4. We have 
also looked at the distribution of the signed impact parameter significance of SLT tracks.
Figure 13 shows that, in contrast with primary leptons, soft leptons inside a superjet are
not prompt. As expected from the simulation of heavy flavor decays, the soft lepton track
is part of the SECVTX tag in 8 out of 13 superjets.
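
The distance between a soft lepton and the jet axis used in this check is the usual separation in the pseudo-rapidity and azimuth plane, $\delta R = \sqrt{\delta\phi^2 + \delta\eta^2}$. A minimal sketch of its computation follows; the function name and the example directions are hypothetical, and the only subtlety is wrapping the azimuthal difference into [-π, π).

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Separation in the (eta, phi) plane, with the azimuthal difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    deta = eta1 - eta2
    return math.hypot(deta, dphi)

# Hypothetical lepton and jet-axis directions on either side of the phi = +-pi boundary.
dr = delta_r(0.1, 3.1, -0.2, -3.1)
print(f"delta R = {dr:.3f}, inside the 0.4 cone: {dr < 0.4}")
```

Without the wrap-around, two directions on opposite sides of the φ boundary would appear to be almost back to back instead of nearly collinear.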



FIG. 12. Distributions of the signed impact parameter significance ($d/\sigma_d$) and of the isolation
of primary leptons, shown for events with superjets and for the complementary sample.






FIG. 13. Distributions of the signed impact parameter significance ($d/\sigma_d$) of soft lepton tracks and of
their distance $\delta R = \sqrt{\delta\phi^2 + \delta\eta^2}$ from the superjet axis, for events with superjets.



IX. ADDITIONAL PROPERTIES OF THE SUPERJETS 

In this section we compare other properties of the superjets to the W+ jet simulation to 
verify if, independent of the excess of soft lepton tags and the discrepancies found in Section 
VII, they are otherwise compatible with being produced by semileptonic decays of b and c 
hadrons. 

A. Lifetime 

A measure of the lifetime of the hadron producing a secondary vertex is

$\text{pseudo-}\tau = \frac{L_{xy}\, M^{svx}}{c\, p_T^{svx}}$,

where $L_{xy}$ is the projection of the transverse displacement of the secondary vertex on the
jet-axis, $M^{svx}$ is the invariant mass and $p_T^{svx}$ is the total transverse momentum of all tracks
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associated with the secondary vertex. In this measurement, the Lorentz boost of the heavy 
flavor hadron is approximated with the Lorentz boost of the SECVTX tag. 

Pseudo-$\tau$ distributions are compared in Figure 14 to the simulation based on the
sample compositions for the superjet and complementary sample. The number of simulated
superjets is rescaled to 13 events. One notes that data and simulation have quite similar
pseudo-$\tau$ distributions. The pseudo-$\tau$ calculation does not account for the neutral particles
emitted in the heavy flavor decay. As a result a kinematic correction factor is needed to
convert it into a lifetime measurement. In the case of beauty or charmed mesons, this factor
is approximately 1.1.

A measure of the lifetime independent of the Lorentz boost is provided by $\tau_{ip}$, which is
proportional to $\langle d_0 \rangle / (\pi c)$, where $\langle d_0 \rangle$ is the error-weighted average impact
parameter of all tracks that form a SECVTX tag and have positive signed impact parameter.
The distribution of the ratio $R_\tau = \tau_{ip} / \text{pseudo-}\tau$ provides a check of the kinematic
correction factor.

We first show that our simulation correctly models the correlation between the lifetime
measured with pseudo-$\tau$ and $\tau_{ip}$ by using the generic-jet samples described in Appendix A.
Figures 15 and 16 show that both methods yield consistent lifetime measurements in the
data and in the simulation in which SECVTX tags are produced by b and c-hadrons. In
this comparison, the contribution of fake tags in jets without heavy flavor is removed by
subtracting the observed distribution of negative SECVTX tags (see Section IV A).



Figure 17 presents the $R_\tau$ distributions in superjet events and in the complementary
sample. The results of the usual K-S comparisons (see Section VII C) between the data and
the simulation are listed in Table IX and indicate overall agreement. As shown in Figure 18,
the distributions of the invariant mass $M^{SVX}$ are also correctly modeled by the simulation.
The transverse momentum distribution of SECVTX tags is discussed in the next subsection.



FIG. 14. Pseudo-$\tau$ distributions for superjets (a) and for tagged jets in the complementary
sample (b) are compared to the simulation (shaded histograms). The distribution for additional
SECVTX tagged jets in superjet events (c) is compared to simulated b-jets.
[Horizontal axes: pseudo-$\tau$ (psec).]

FIG. 17. Distributions of the variable $R_\tau$ (see text) for superjets (a) and for tagged jets in the
complementary sample (b) are compared to the simulation (shaded histograms). The distribution
for b-jets in superjet events (c) is compared to simulated b-jets.

TABLE IX. Result of K-S comparisons between data and simulation. For each variable we list
the observed K-S distance $\delta^0$ and the probability $P$ of making an observation with a distance no
smaller than $\delta^0$.

                          Events with a superjet      Complementary sample
Variable                  $\delta^0$    $P$ (%)       $\delta^0$    $P$ (%)
$R_\tau$ (superjets)      0.44          4.7           0.15          35.1
$R_\tau$ (b-jets)         0.44          39.0
$M^{SVX}$                 0.20          56.9          0.10          51.4
$p_T^{SLT}$               0.55          0.09
$p_T^{SVX}$                                           0.14          47.4

FIG. 18. Distributions of $M^{SVX}$, the invariant mass of the tracks associated with a secondary
vertex, are compared to the simulation (shaded histograms) normalized to the same number of
events.

B. Transverse momentum distribution of SLT tags



Figure 19 compares the distribution of $p_T^{SLT}$, the soft lepton transverse momentum, in
the 13 superjets to the simulation based on the sample composition listed in Table 0. The
$p_T^{SLT}$ spectrum depends on the jet transverse energy, and the superjet transverse energy
distribution in the data is stiffer than in the SM expectation (see Figure 3). Therefore, we
have corrected the transverse energy distribution of simulated superjets to match
the data. Figure 19 shows that soft leptons in superjet events have transverse momenta
larger than what is expected for semileptonic decays of b and c-quarks. By construction the
complementary sample does not contain soft lepton tags. However, $p_T^{SVX}$, the total transverse
momentum of all tracks forming a SECVTX tag, is a useful analogue. If the difference
between the transverse momentum of the soft lepton tag in the data and the simulation
were due to inadequate modeling of the hadronization process, the $p_T^{SVX}$ distribution in
the complementary sample would also disagree with the simulation. However, Figure 20(a)
shows agreement between the complementary sample and the simulation$^6$. The result of the
K-S comparison of these distributions is also listed in Table IX. The probability that the
$p_T^{SVX}$ distribution in the complementary sample is produced according to the simulation is
P = 47%. The probability that the $p_T^{SLT}$ distribution in superjets is consistent with the SM
simulation is P = 0.1%.
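
The K-S distance quoted in these comparisons is the largest gap between two cumulative distributions. The sketch below is a generic two-sample version in plain Python; it is not the procedure of Section VII C (which also assigns the probability $P$), and the function name and the toy momenta are hypothetical.

```python
def ks_distance(sample_a, sample_b):
    """Maximum distance between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    i = j = 0
    d_max = 0.0
    while i < len(a) and j < len(b):
        # Step both CDFs past the next smallest value, then compare them.
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d_max = max(d_max, abs(i / len(a) - j / len(b)))
    return d_max


if __name__ == "__main__":
    # Toy transverse momenta in GeV/c, invented for illustration only.
    stiff = [7.1, 8.8, 10.4, 10.9, 15.2, 24.7]
    soft = [2.5, 3.1, 3.8, 4.4, 5.0, 6.2, 7.5, 9.0]
    print(ks_distance(stiff, soft))
```

Once one sample is exhausted its CDF is 1 and the gap can only shrink, which is why the loop may stop there.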



6 Since most of the SLT tracks are associated with the secondary vertex, the $p_T^{SVX}$ distribution
for superjets appears stiffer than in the complementary sample and in the simulation.

FIG. 19. The distribution of the transverse momentum of soft leptons in superjet events is
compared to the SM expectation normalized to the same number of tags and corrected for the
superjet $E_T$ distribution. One superjet contains two soft leptons.

FIG. 20. Distributions of the transverse momentum of all tracks forming a SECVTX tag in the
complementary sample (a) and in superjets (b).



C. Comparison of $p_T^{SLT}$ and $p_T^{SVX}$ distributions in generic-jet data to the simulation

We compare superjets in generic-jet data and in the corresponding simulation to check if 
the discrepancy between the observed and predicted transverse momentum distribution of 
soft lepton tags is due to the modeling of semileptonic decays in QQ or to the modeling of the
hadronization in herwig. The generic-jet data and simulation are described in Appendix A.
The heavy flavor content of this sample is similar to that of W+ 2,3 jet events. We normalize 
data and simulation to the same number of events and in both we search for jets which 
contain positive and negative SECVTX tags. We then search for additional soft lepton 
tags in jets tagged by SECVTX. The data and simulation contain approximately the same 
number of supertags as a result of the calibration of the SLT efficiency in the simulation (see 
Appendix A). Fake SECVTX tags are evaluated and removed using the number of observed 
negative SECVTX tags in the data and the simulation. We do not remove the contribution 
of fake SLT tags from the data but we add fake SLT tags to the simulation by weighting 
each track in a simulated jet with the same SLT fake probability normally used to evaluate 
the rate of fake tags in the data. 

In $5.5 \times 10^5$ generic-jet events we find 1324 events with a supertag in the data and 1342
in the simulation. Distributions of the transverse momentum of soft lepton tags and of all
tracks forming a SECVTX tag are shown in Figure 21. The agreement between data and
simulation provides evidence that we correctly model b and c-jets.
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FIG. 21. Distributions of the transverse momentum of soft leptons (a) and of all tracks form- 
ing the SECVTX tags (b) in superjets selected in generic-jet data and in the corresponding SM 
simulation. Data and simulation are normalized to the same number of events before tagging. 
X. ADDITIONAL CROSS-CHECKS



The selection criteria used in this analysis were optimized for finding the top quark |fl.
The high-$p_T$ inclusive lepton data set, from which we have selected the sample used in this
study, consists of about 82,000 events with one or more jets before making requirements on
the transverse momentum and isolation of the primary lepton and on the missing transverse
energy. Half of these events have primary leptons which are not well isolated ($I > 0.2$). They
are mostly due to multi-jet production with one jet containing a fake lepton, but also include
a small amount of $b\bar{b}$ and $c\bar{c}$ production. The $p_T > 20$ GeV/c, $I < 0.1$ and $\not\!\!E_T > 20$ GeV
cuts reduce this data set to an almost pure W+ jet sample of about 11,000 events. In
subsection A, we investigate the rate of superjets in the kinematic regions removed in the
original selection of the W+ ≥ 1 jet sample. This checks that events with a superjet are not
the tail of a large unexpected background. In subsection B we look at the effect of removing
the trigger requirement for primary muons and in subsection C we extend our search to
events with a primary electron in the plug calorimeter.

A. Dependence on $\not\!\!E_T$ and on the isolation and transverse momentum of the primary lepton

There are 36,677 events with a primary lepton with $p_T > 20$ GeV/c and $I < 0.2$; 615
events have SECVTX tags (their $I$ vs. $\not\!\!E_T$ distribution is shown in Figure 22). Using
nominal cuts for selecting the primary lepton, we first study the rate of supertags in events
tagged by SECVTX when $\not\!\!E_T < 20$ GeV. With the exception of non-W events, which are
the largest fraction, the relative contribution of all other SM processes does not depend on
$\not\!\!E_T$. Since the ratios of supertags to SECVTX tags in non-W events and in the sum of the
remaining processes are quite similar, in this case we predict the number of supertags in
this sample by multiplying the number of observed SECVTX tags by the predicted ratio of
supertags to SECVTX tags for events with $\not\!\!E_T > 20$ GeV. The observed number agrees with
the expectation as shown in Table X.

In Table XI we compare rates of supertags in events tagged by SECVTX when the isolation
of the primary lepton is large. These events are mostly contributed by $b\bar{b}$ production.
The number of observed supertags in events with $\not\!\!E_T > 20$ GeV is consistent with the
prediction of the method used to estimate the non-W background (we multiply the number of
SECVTX tags in events with $\not\!\!E_T > 20$ GeV by the ratio of supertags to SECVTX tags in
events with $\not\!\!E_T < 20$ GeV).
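
The extrapolation described in parentheses is a simple ratio method. A minimal Python sketch follows, using the 1-jet and 2-jet counts of Table XI; the function name is hypothetical and the sketch reproduces only the central values, not the quoted uncertainties.

```python
def predict_supertags(n_secvtx_signal, n_supertag_control, n_secvtx_control):
    """Scale the SECVTX tags of the signal region by the supertag/SECVTX
    ratio measured in the control region."""
    if n_secvtx_control <= 0:
        raise ValueError("control region must contain SECVTX tags")
    return n_secvtx_signal * n_supertag_control / n_secvtx_control


if __name__ == "__main__":
    # Control region: missing E_T < 20 GeV; signal region: missing E_T > 20 GeV.
    # Counts for primary leptons with 0.1 < I < 0.2 (Table XI).
    one_jet = predict_supertags(8, 17, 220)
    two_jet = predict_supertags(3, 4, 33)
    print(round(one_jet, 2), round(two_jet, 2))
```

These central values round to the 0.6 and 0.4 events listed as predictions in Table XI.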

As shown in Figure 1, many primary leptons in superjet events have transverse momentum
close to the threshold used to select the sample. We have checked that we are not
observing the tail of a distribution peaking at small transverse momenta by first removing
the 20 GeV/c transverse momentum cut on the primary lepton (the $p_T$ threshold of the L3
trigger is about 18 GeV/c). Before tagging, the size of the W+ jet sample increases by 20%.
As shown in Table XII, no additional events with a supertag are found.

We have then searched for events with a superjet in the low-$p_T$ inclusive lepton sample
collected during the 1994-1995 collider run (Run IB) using a L3 trigger threshold of 8 GeV/c
(8 of the 13 events with a superjet were collected in Run IB). Because of the lower threshold,
the trigger rate was prescaled by a factor of 1.3. In this sample we find 7 events having a
primary lepton with $p_T > 10$ GeV/c and $I < 0.1$, $\not\!\!E_T > 20$ GeV, and containing a superjet
and 1 or 2 additional jets. Six of the 7 events are the same events found in the high-$p_T$
inclusive lepton sample; the additional event contains a primary electron with $E_T = 17.7$
GeV.



FIG. 22. Distribution of primary lepton isolation vs. $\not\!\!E_T$ for events containing one or more jets
tagged by SECVTX. The primary lepton transverse momentum is larger than 20 GeV/c.
[Horizontal axis: $\not\!\!E_T$ (GeV).]

TABLE X. Number of tagged events as a function of the jet multiplicity. The events are selected
by requiring $\not\!\!E_T < 20$ GeV and a primary lepton with $p_T > 20$ GeV/c and $I < 0.1$. The predicted
number of supertags is based upon the observed number of SECVTX tags (see text).

Tag type      1 jet       2 jets      3 jets      ≥ 4 jets
SECVTX        168         21          7           6
Supertag      12          1
Prediction    10.2±1.3    1.2±0.2     0.5±0.2     0.5±0.2

TABLE XI. Yield of events with supertags as a function of the jet multiplicity. We select primary
leptons with $p_T > 20$ GeV/c and isolation $0.1 < I < 0.2$. The prediction of supertags in events
with $\not\!\!E_T > 20$ GeV is derived using the ratio of supertags to SECVTX tags in events with
$\not\!\!E_T < 20$ GeV.

$\not\!\!E_T < 20$ GeV
Tag type      1 jet        2 jet        3 jet        ≥ 4 jet
SECVTX        220          33           10           2
Supertag      17           4            2            1

$\not\!\!E_T > 20$ GeV
Tag type      1 jet        2 jet        3 jet        ≥ 4 jet
SECVTX        8            3            5
Supertag      2            1            0
Prediction    0.6 ± 0.1    0.4 ± 0.2    1.0 ± 0.7

TABLE XII. Numbers of tagged W+ jet events with $\not\!\!E_T > 20$ GeV and primary leptons with
$I < 0.1$ and $p_T < 20$ GeV/c.

Tag type      W + 1 jet    W + 2 jet    W + 3 jet    W + ≥ 4 jet
SECVTX        2            1
Supertag

B. Removal of the trigger requirement for primary muons



In selecting the events used in this analysis, we require that the primary lepton has fired
the appropriate second level (L2) trigger (see Section III A). The second level of the muon
trigger requires a match between a CTC track reconstructed by a fast track processor [34]
and a track segment in the muon chambers, which fired the first level trigger 0,0]. The
L2 trigger efficiency for primary muons is approximately 70% [7]. Based on the observed
13 events with a superjet, we should have lost about two such events because the primary
muon failed the muon trigger (the detector has about the same acceptance for electrons
and muons). However, the original high-$p_T$ lepton data set also contains events triggered
by other objects in the events. As shown in Figure 19, 85% of the superjets contain a
soft lepton with transverse momentum comparable to or larger than the L2 trigger threshold.
If the observed transverse momentum distribution of the soft leptons is not a statistical
fluctuation, we could find in the original data sample one or two additional events with a
supertag in which the primary muon failed the trigger but the event was rescued by the soft
muon. On the other hand, according to the SM simulation, only 9.6% of the W+ jet events
with a SLT tag contain a soft muon which passes the trigger $p_T$ requirement. Using the
predicted rates listed in Table [TIT], we estimate that: 31 W+ 1 jet events and 12 W+ 2,3 jet
events with a primary muon have failed the trigger; 3 W+ 1 jet events and 1.1 W+ 2,3 jet
events can be rescued by a soft muon. Of these events, 0.09 W+ 1 jet and 0.08 W+ 2,3 jet
events are expected to contain a jet with a supertag.
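
The estimate of "about two" lost events can be checked with a one-line calculation. The Python sketch below assumes that the relevant count is the number of superjet events with a primary muon, which is 5 of the 13 events listed in Tables XVI and XVII, and that the loss is governed only by the quoted 70% efficiency; the function name is hypothetical.

```python
def expected_lost(n_observed, efficiency):
    """Events expected to have failed a trigger of the given efficiency,
    given the number that passed it."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must lie in (0, 1]")
    # n_observed / efficiency were produced; the remainder failed the trigger.
    return n_observed * (1.0 - efficiency) / efficiency


if __name__ == "__main__":
    # 5 primary-muon superjet events observed, L2 muon efficiency of about 70%.
    print(expected_lost(5, 0.70))
```

With these inputs the result is slightly above two events, in line with the statement in the text.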

In the data, after removing the trigger requirement on the primary muon, we recover 
three W+ 1 jet events, none of which contains supertags. We also recover one W+ 2 jet 
and one W+ 3 jet event, both with a supertag. No extra W+ 4 jet event is found. The 
characteristics of these two events are listed in Appendix B. 






C. Study of plug electrons 



As shown in Figure 2, the pseudo-rapidity distribution of primary leptons in events with
a superjet appears to rise at the end of the central detector acceptance ($|\eta| \simeq 1$). Motivated
by this observation, we have searched for events with a superjet using primary electrons
in the plug calorimeter. The pseudo-rapidity and transverse momentum distributions of
plug electrons are shown in Figure 23. We select W+ jet events requiring an isolated plug
electron with $E_T > 20$ GeV and $\not\!\!E_T > 20$ GeV.

Table XIII lists rates of W+ jet events with a primary plug electron before and after
tagging. We observe two additional W+ 2,3 jet events with a supertag, when 0.34 ± 0.04
events are expected from known processes. The characteristics of these two additional events
with a superjet are listed in Appendix B.
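
To give a feeling for how unusual such counts are, the following Python sketch computes a plain Poisson tail probability. It is an illustration only: it ignores the uncertainty on the expected number of events and is not the statistical procedure used in this analysis; the function name is hypothetical.

```python
import math


def poisson_tail(n_obs, mean):
    """Probability of observing n_obs or more events for a Poisson mean."""
    if n_obs < 0 or mean < 0.0:
        raise ValueError("n_obs and mean must be non-negative")
    below = sum(math.exp(-mean) * mean ** k / math.factorial(k)
                for k in range(n_obs))
    return 1.0 - below


if __name__ == "__main__":
    # Plug electron sample: 2 supertags observed, 0.34 expected.
    print(poisson_tail(2, 0.34))
    # Central lepton sample: 13 supertags observed, 4.4 expected.
    print(poisson_tail(13, 4.4))
```

Under these simplifying assumptions the first probability is at the level of a few percent, and the second is well below one percent.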
FIG. 23. Distributions of the transverse momentum and the pseudo-rapidity with respect to the
nominal interaction point of plug electrons.

TABLE XIII. Number of events with an isolated plug electron and $\not\!\!E_T > 20$ GeV before and after
tagging. Since the relative contributions of different processes are not affected by the difference in
the pseudo-rapidity range covered by central leptons and plug electrons, the prediction of supertags
is derived from Table [V| after normalizing to the same number of SECVTX tags.

Source           W + 1 jet    W + 2 jet      W + 3 jet      W + ≥ 4 jet
Data             1245         243            52             11
SECVTX tags      15           3              1              1
Supertags        3            2
SM prediction    0.9 ± 0.1    0.24 ± 0.03    0.10 ± 0.02    0.10 ± 0.03



XI. CONCLUSIONS 

We have carried out a study of the heavy flavor content of jets produced in association 
with W bosons. Comparisons of the observed rates of SECVTX (displaced vertex) and SLT 
(soft lepton) tags with standard model predictions, including NLO calculations of single and 
pair produced top quarks, are generally in good agreement. However, we find an excess of 
events which have jets with both SECVTX and SLT heavy flavor tags. The standard model 
expectation for these W+ 2,3 jet events is 4.4 ±0.6 events, while 13 are observed. A detailed 
examination of the kinematic properties of these events finds that they are statistically 
difficult to reconcile with a simulation of standard model processes, which well reproduces 
closely related samples of data. Although obscure detector effects can never be ruled out, 
extensive studies of these events and investigations of larger statistics samples of generic- 
jet data have not revealed any effects which indicate the existence of detector problems or 
simulation deficiencies. We are not aware of any model for new physics which incorporates 
the production and decay properties necessary to explain all features of these events. Work 
is continuing on studies of the present data. With much larger data samples from
Run II of the Tevatron, we will be able to explore this class of events in greater detail.
APPENDIX A: COMPARISON OF RATES OF SUPERTAGS IN GENERIC-JET
DATA AND IN THE CORRESPONDING SIMULATION



Table XIV lists rates of tags in generic-jet data and in the corresponding simulation.
This comparison profits from the measurement of the heavy flavor composition of generic-jet
data and of the calibration of the herwig generator presented in Ref. O. A summary
of that study is provided here. Generic-jet data are events collected by requiring at least
one jet with transverse energy above trigger threshold (i.e. a 20 GeV threshold for JET 20
data). As usual we consider jets with $E_T > 15$ GeV and pseudo-rapidity $|\eta| < 2$. We apply
the additional requirement that at least one of the jets in the event contains two SVX tracks
and is therefore taggable by SECVTX or JPB. An equal number of $2 \to 2$ hard-scattering
events is simulated using option 1500 of the herwig generator and the MRS (G) parton
distribution functions [35]. In the simulation, jets with heavy flavor come from heavy quarks
in the initial or final state of the hard scattering (flavor excitation and direct production)
or from gluon splitting. A 13.2% fraction of the simulated jets contains heavy flavor (4.7%
due to b-hadrons and 8.5% due to c-hadrons). A 3.5% fraction of the simulated jets contains
heavy flavor and is tagged by SECVTX (73% of the tagged jets are initiated by a b-quark
and 27% by a c-quark). Jet-probability is more efficient than SECVTX in tagging c-jets.
A 4.6% fraction of the simulated jets contains heavy flavor and is tagged by jet-probability
(55% of the tagged jets are initiated by a b-quark and 45% by a c-quark).

The heavy flavor production cross sections calculated by herwig have been tuned in 
Ref. [0| to reproduce the pattern of SECVTX and JPB tags observed in generic-jet data. 
herwig gives a good description of the data provided that the direct and flavor excitation 
production cross sections are increased by 1.10 ± 0.16 and the fraction of gluons branching 
to heavy quarks is increased by 1.36 ± 0.22. The accuracy of this calibration is limited by 
our understanding of the tagging efficiencies. The factors required to calibrate simulated 
rates of SECVTX or JPB tags are determined more accurately: 1.1 ± 0.1 for direct and 
flavor excitation production and 1.38 ± 0.09 for gluon splitting. 



Table XIV also shows agreement between the number of jets with heavy flavor tagged by
the SLT algorithm in the data and simulation (the SLT algorithm was not used to calibrate
the simulation). However, the numbers of SLT tags in the data have large errors because
the ratio of tags due to heavy flavor to mistags is about 1/5. For jets with a supertag
(SECVTX+SLT or JPB+SLT) the ratio of tags due to heavy flavor to mistags is about 2/1,
and this allows a good calibration of the efficiency for finding supertags in the simulation.
We compare ratios of supertags to SECVTX (JPB) tags in the data and the simulation
in order to cancel the contribution of the uncertainty of the simulated SECVTX (JPB)
algorithms. Efficiencies for finding SLT tags in jets already tagged by SECVTX or JPB are
listed in Table XV. We find that the efficiency for finding supertags in the data is (84 ±
5)% of the simulated efficiency. The small differences in the tagging efficiency between data
and simulation in Table XV do not seem to be caused by a particular flavor type, because
the relative fractions of b and c-quarks are quite different in jets tagged by SECVTX and
jet-probability. The uniformity of the data-to-simulation scale factor for finding supertags
across the three independent generic-jet samples also excludes any large dependence on the
jet transverse energy. If we combine these three samples, we find that the efficiency for
finding supertags in the data is (85 ± 5)% of the simulated efficiency for SECVTX tags
and (86 ± 7)% for JPB tags. Since the heavy flavor composition of generic-jet data with a
SECVTX tag (73% b-quarks and 27% c-quarks) is very similar to the composition of W+ >
2,3 jet events with a SECVTX tag, the excess of W+ 2,3 jet events with a supertag cannot
be explained by correlations between the SLT and SECVTX algorithms unaccounted for by
the simulation.
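
One simple way to combine the three data-to-simulation scale factors of Table XV is an inverse-variance weighted average, sketched below in Python. This is an assumed combination method, not necessarily the one used to obtain the numbers quoted above (which may instead be derived from the summed tag counts); the function name is hypothetical.

```python
def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its uncertainty."""
    if len(values) != len(errors) or not values:
        raise ValueError("need equally long, non-empty inputs")
    weights = [1.0 / err ** 2 for err in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, total ** -0.5


if __name__ == "__main__":
    # Data/Sim. ratios for JET 20, JET 50 and JET 100 (Table XV).
    secvtx = weighted_mean([0.83, 0.88, 0.80], [0.12, 0.09, 0.10])
    jpb = weighted_mean([0.88, 0.86, 0.86], [0.13, 0.12, 0.14])
    print(secvtx, jpb)
```

The averages obtained this way lie within one percentage point of the combined values quoted in the text, with slightly larger uncertainties.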
TABLE XIV. Number of tags due to heavy flavors observed in generic-jet data and in the
simulation normalized to the same number of events before tagging. The amount of mistags
removed from the data is indicated in parentheses; errors include a 10% uncertainty in the mistag
evaluation. The error of the number of simulated SLT tags includes the 10% uncertainty on the
SLT tagging efficiency. This error is not included for simulated SECVTX+SLT and JPB+SLT
tags as we intend to calibrate the simulation efficiency with the data.

JET 20 (194,009 events)
Tag type        Data (removed fakes)    Simulation
SECVTX          4058±92 (616.0)         4052±143
JPB             5542±295 (2801.0)       5573±173
SLT             1032±402 (3962.0)       826±122
SLT+SECVTX      219.8±20 (94.2)         263±29
SLT+JPB         287.3±28 (166.7)        330±29

JET 50 (151,270 events)
Tag type        Data (removed fakes)    Simulation
SECVTX          5176±158 (1360.0)       5314±142
JPB             6833±482 (4700.0)       6740±171
SLT             1167±530 (5241.0)       1116±111
SLT+SECVTX      347±29 (169.0)          404±22
SLT+JPB         427.5±42 (288.5)        490±32

JET 100 (129,434 events)
Tag type        Data (removed fakes)    Simulation
SECVTX          5455±239 (2227.0)       5889±176
JPB             6871±659 (6494.0)       7263±202
SLT             1116±642 (6367.0)       1160±168
SLT+SECVTX      377.6±36 (243.4)        508±35
SLT+JPB         451.8±55 (401.2)        563±34

TABLE XV. Fractions of SECVTX and JPB tags with a supertag in generic-jet data and in the
corresponding simulation. In the simulation the fraction of supertags is slightly higher than in the
data, independent of the jet transverse energy and the heavy flavor type.

Sample     Ratio                   Data           Sim.           Data/Sim.
JET 20     (SLT+SECVTX)/SECVTX     0.054±0.005    0.065±0.007    0.83±0.12
JET 20     (SLT+JPB)/JPB           0.052±0.006    0.059±0.005    0.88±0.13
JET 50     (SLT+SECVTX)/SECVTX     0.067±0.006    0.076±0.004    0.88±0.09
JET 50     (SLT+JPB)/JPB           0.063±0.008    0.073±0.005    0.86±0.12
JET 100    (SLT+SECVTX)/SECVTX     0.069±0.007    0.086±0.006    0.80±0.10
JET 100    (SLT+JPB)/JPB           0.066±0.010    0.077±0.005    0.86±0.14



APPENDIX B: CHARACTERISTICS OF THE EVENTS WITH A SUPERJET 



Tables XVI and XVII list the characteristics of the 13 events with a superjet. Four of
these events are included in the data set used to measure the top quark mass [18].

Event 41540/127085 in Table XVI is classified in Ref. as a dilepton event. In the
present analysis, which uses tighter lepton selection criteria, the muon candidate appears to
be due to punch-through of a stiff track inside the jet with $E_T = 144.5$ GeV. The fit of this
event yields a top quark mass $M_{top} = 158.8$ GeV/$c^2$.

The other three events (65581/322592, 67824/281883 and 56911/114159 in Table XVII)
contain an additional jet with $E_T > 8$ GeV and $|\eta| < 2.4$. The fit of these events in Ref. [18]
yields $M_{top}$ = 152.7, 170.1 and 156.7 GeV/$c^2$, respectively.

Table XVIII lists the characteristics of the two events found by removing the L2 trigger
requirement for primary muons. The characteristics of the two additional events with a
superjet and a primary plug electron are listed in Table XIX.
TABLE XVI. Characteristics of W+ 2 jet events with a superjet. Jets tagged by the
SECVTX (SLT) algorithm are labeled SECVTX (SLT). Jet energies are corrected for calorimeter
non-linearities and out-of-cone losses; $\not\!\!E_T$ is evaluated after these corrections are applied.

                          $p_T$ (GeV/c)    $\eta$     $\phi$ (rad)

Run 46935 event 266805
electron (-)              29.7             -0.87      0.15
Jet 1                     49.6             -0.61      5.46
Jet 2 (SECVTX,SLT)        41.1              0.43      2.70
$\not\!\!E_T$             19.8                        2.56
SLT ($\mu^-$)              3.8              0.52      2.63
$Z_{vrtx}$ (cm)          -20.71

Run 41540 event 127085
electron (-)              22.2              0.84      0.57
Jet 1 (SECVTX,SLT)       144.5              0.11      6.15
Jet 2                     61.5             -0.54      3.75
$\not\!\!E_T$             92.1                        3.05
SLT                        8.8              0.18      6.14
$Z_{vrtx}$ (cm)           -4.77

Run 41627 event 87219
electron (-)              78.5              0.90      4.56
Jet 1                     68.7              0.11      3.03
Jet 2 (SECVTX,SLT)        58.0              0.50      1.23
$\not\!\!E_T$             47.4                        0.23
SLT ($\mu^-$)             10.4              0.47      1.26
$Z_{vrtx}$ (cm)          -28.11

Run 61167 event 368226
electron (+)              22.2              0.76      1.37
Jet 1 (SECVTX,SLT)        99.3             -0.16      1.86
Jet 2 (SECVTX)            68.1              0.93      5.48
$\not\!\!E_T$             36.0                        3.61
SLT ($\mu^-$)             24.7             -0.11      1.92
$Z_{vrtx}$ (cm)          -14.20

Run 65384 event 266051
electron (-)              21.9              0.68      0.65
Jet 1                     73.9              2.06      0.33
Jet 2 (SECVTX,SLT)        59.0              0.61      4.92
$\not\!\!E_T$             96.2                        3.02
SLT ($\mu^+$)             10.9              0.61      4.80
$Z_{vrtx}$ (cm)          -24.24

Run 65741 event 654870
muon (+)                  47.2              0.79      6.01
Jet 1 (SECVTX,SLT)       109.4              0.63      4.58
Jet 2 (SECVTX)            63.9              0.31      2.87
$\not\!\!E_T$             95.8                        1.31
SLT ($e^+$)                7.1              0.76      4.61
$Z_{vrtx}$ (cm)          -14.20

Run 46357 event 511399
muon (-)                  22.2             -0.82      5.64
Jet 1                     58.2             -0.20      6.10
Jet 2 (SECVTX,SLT)        41.2              0.27      2.84
$\not\!\!E_T$             39.8                        2.89
SLT ($\mu^+$)             15.2              0.25      2.96
SLT ($e^-$)                7.1              0.38      2.89
$Z_{vrtx}$ (cm)          -12.36

Run 69520 event 136405
electron (-)              20.4              1.01      0.20
Jet 1                     44.2             -0.61      5.57
Jet 2 (SECVTX,SLT)        32.7             -0.88      2.71
$\not\!\!E_T$             27.5                        2.42
SLT ($\mu^+$)             11.3             -0.87      2.71
$Z_{vrtx}$ (cm)          -24.13

TABLE XVII. Characteristics of the W+ 3 jet events with a superjet. Jets tagged by the
SECVTX (SLT) algorithm are labeled SECVTX (SLT). Jet energies are corrected for calorimeter
non-linearities and out-of-cone losses; $\not\!\!E_T$ is evaluated after these corrections are applied.

                          $p_T$ (GeV/c)    $\eta$     $\phi$ (rad)

Run 56911 event 114159
electron (-)              58.5              0.92      0.83
Jet 1                    203.4             -0.13      2.93
Jet 2 (SECVTX,SLT)        65.5              0.82      5.80
Jet 3                     24.1              0.60      0.00
$\not\!\!E_T$             61.5                        5.41
SLT ($\mu^+$)              9.3              0.77      5.75
$Z_{vrtx}$ (cm)          -13.89

Run 61548 event 284898
muon (+)                  20.3             -0.54      3.00
Jet 1                     72.4              0.55      1.96
Jet 2 (SECVTX)            64.9              0.44      3.94
Jet 3 (SECVTX,SLT)        58.7              0.07      5.73
$\not\!\!E_T$             38.8                        0.02
SLT (e)                   14.6              0.09      5.83
$Z_{vrtx}$ (cm)           16.38

Run 65581 event 322592
muon (-)                  21.4              0.57      6.00
Jet 1 (SECVTX)           146.3             -0.56      1.21
Jet 2 (SECVTX,SLT)        65.8              0.51      3.38
Jet 3                     29.7              1.50      4.68
$\not\!\!E_T$             70.2                        3.78
SLT                       31.3              0.58      3.34
$Z_{vrtx}$ (cm)            5.54

Run 67824 event 281883
electron (+)              52.3             -0.16      3.64
Jet 1 (SECVTX)            78.8             -0.49      0.90
Jet 2                     66.3              0.69      5.83
Jet 3 (SECVTX,SLT)        55.8              0.68      2.09
$\not\!\!E_T$             57.6                        4.30
SLT ($\mu^-$)              7.2              0.88      1.97
$Z_{vrtx}$ (cm)          -10.56

Run 46818 event 221912
muon (-)                  48.2              1.02      2.36
Jet 1 (SECVTX,SLT)        55.4             -0.02      2.96
Jet 2                     41.7              0.27      5.08
Jet 3                     35.3              0.82      5.68
$\not\!\!E_T$             22.3                        0.30
SLT ($\mu^+$)             10.5              0.06      2.93
$Z_{vrtx}$ (cm)          -17.28

TABLE XVIII. Characteristics of the W+ 2 jet events with a superjet rescued by removing the
L2 trigger requirement.

                          $p_T$ (GeV/c)    $\eta$     $\phi$ (rad)

Run 61525 event 116807
muon (+)                  50.5              0.48      0.58
Jet 1 (SECVTX,SLT)        66.3              0.10      4.45
Jet 2                     36.8             -0.71      1.87
$\not\!\!E_T$             22.2                        4.30
SLT ($\mu$)               11.2              0.11      4.36
$Z_{vrtx}$ (cm)            5.72

Run 68592 event 250386
muon (-)                  57.5             -0.07      4.69
Jet 1                     60.6             -1.08      4.09
Jet 2 (SECVTX,SLT)        42.5             -0.17      1.44
Jet 3                     32.5              1.58      0.97
$\not\!\!E_T$             36.1                        1.12
SLT ($\mu^+$)              7.9             -0.21      1.42
$Z_{vrtx}$ (cm)           14.48



TABLE XIX. Characteristics of the W+ 2,3 jet events with a superjet found in the plug electron
sample.

                          $p_T$ (GeV/c)    $\eta$     $\phi$ (rad)

Run 69941 event 66919
electron (-)              43.4             -1.33      0.77
Jet 1 (SECVTX,SLT)        84.5             -0.12      4.09
Jet 2                     50.7              1.99      1.29
$\not\!\!E_T$             11.6                        4.53
SLT ($\mu^+$)             13.5             -0.09      4.06
$Z_{vrtx}$ (cm)           16.00

Run 58202 event 109847
electron (+)              65.9              1.45      1.43
Jet 1                     32.6              0.28      4.84
Jet 2 (SECVTX,SLT)        30.8             -0.75      4.38
$\not\!\!E_T$             12.5                        4.73
SLT ($e^-$)                3.5             -0.63      4.49
$Z_{vrtx}$ (cm)          -18.08